Overview

Dataset statistics

Number of variables: 45
Number of observations: 407684
Missing cells: 3871070
Missing cells (%): 21.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 159.2 MiB
Average record size in memory: 409.5 B
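
The percentage and per-record figures above follow from the raw counts. The sketch below re-derives them from the numbers in this table; it assumes that "MiB" means 2**20 bytes and that missing cells are counted over variables × observations.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 45
n_observations = 407_684
missing_cells = 3_871_070
total_size_mib = 159.2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```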

Variable types

Numeric: 10
Categorical: 34
Boolean: 1

Alerts

ori has constant value "CA0371100" [Constant]
agency has constant value "SD" [Constant]
gendnc_code has constant value "5.0" [Constant]
id has a high cardinality: 407684 distinct values [High cardinality]
date has a high cardinality: 912 distinct values [High cardinality]
time has a high cardinality: 77771 distinct values [High cardinality]
inters has a high cardinality: 15939 distinct values [High cardinality]
street has a high cardinality: 44668 distinct values [High cardinality]
hw_exit has a high cardinality: 2211 distinct values [High cardinality]
school_name has a high cardinality: 85 distinct values [High cardinality]
beat_name has a high cardinality: 127 distinct values [High cardinality]
disability has a high cardinality: 134 distinct values [High cardinality]
reason_text has a high cardinality: 1697 distinct values [High cardinality]
reason_detail has a high cardinality: 282 distinct values [High cardinality]
reason_exp has a high cardinality: 183583 distinct values [High cardinality]
search_basis has a high cardinality: 721 distinct values [High cardinality]
search_basis_exp has a high cardinality: 28990 distinct values [High cardinality]
prop_type has a high cardinality: 490 distinct values [High cardinality]
cont has a high cardinality: 669 distinct values [High cardinality]
actions has a high cardinality: 11672 distinct values [High cardinality]
act_consent has a high cardinality: 335 distinct values [High cardinality]
is_serv is highly imbalanced (50.6%) [Imbalance]
assign_words is highly imbalanced (84.0%) [Imbalance]
is_school is highly imbalanced (99.1%) [Imbalance]
city is highly imbalanced (96.7%) [Imbalance]
is_student is highly imbalanced (99.5%) [Imbalance]
lim_eng is highly imbalanced (86.4%) [Imbalance]
gender_words is highly imbalanced (56.8%) [Imbalance]
is_gendnc is highly imbalanced (99.5%) [Imbalance]
gender_code is highly imbalanced (62.7%) [Imbalance]
lgbt is highly imbalanced (82.4%) [Imbalance]
disability is highly imbalanced (95.0%) [Imbalance]
reason_words is highly imbalanced (56.3%) [Imbalance]
reason_detail is highly imbalanced (65.7%) [Imbalance]
search_basis is highly imbalanced (69.5%) [Imbalance]
cont is highly imbalanced (91.4%) [Imbalance]
actions is highly imbalanced (70.9%) [Imbalance]
act_consent is highly imbalanced (64.6%) [Imbalance]
inters has 366868 (90.0%) missing values [Missing]
block has 43330 (10.6%) missing values [Missing]
ldmk has 407643 (> 99.9%) missing values [Missing]
street has 16834 (4.1%) missing values [Missing]
hw_exit has 404618 (99.2%) missing values [Missing]
school_name has 407362 (99.9%) missing values [Missing]
gendnc_code has 407507 (> 99.9%) missing values [Missing]
reasonid has 18844 (4.6%) missing values [Missing]
reason_text has 18844 (4.6%) missing values [Missing]
reason_detail has 18838 (4.6%) missing values [Missing]
search_basis has 321160 (78.8%) missing values [Missing]
search_basis_exp has 344258 (84.4%) missing values [Missing]
seiz_basis has 398568 (97.8%) missing values [Missing]
prop_type has 398568 (97.8%) missing values [Missing]
act_consent has 297641 (73.0%) missing values [Missing]
block is highly skewed (γ1 = 254.9436976) [Skewed]
id is uniformly distributed [Uniform]
ldmk is uniformly distributed [Uniform]
id has unique values [Unique]

Reproduction

Analysis started: 2023-04-28 21:31:57.474986
Analysis finished: 2023-04-28 21:32:28.554789
Duration: 31.08 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
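
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the timestamp format string is an assumption based on how the values are printed here.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-04-28 21:31:57.474986", FMT)
finished = datetime.strptime("2023-04-28 21:32:28.554789", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```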

Variables

Unnamed: 0
Real number (ℝ)

Distinct187251
Distinct (%)45.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean76803.876
Minimum1
Maximum187251
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum1
5-th percentile6795.15
Q133974
median67948
Q3117975
95-th percentile166866.85
Maximum187251
Range187250
Interquartile range (IQR)84001

Descriptive statistics

Standard deviation: 50412.731
Coefficient of variation (CV): 0.65638264
Kurtosis: -0.97628217
Mean: 76803.876
Median Absolute Deviation (MAD): 40395
Skewness: 0.35472575
Sum: 3.1311711 × 10^10
Variance: 2.5414434 × 10^9
Monotonicity: Not monotonic
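
Several of the statistics for this variable are derived from others. As a rough consistency check, the sketch below recomputes the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The inputs are rounded as printed, so the results agree only to the displayed precision.

```python
# Values copied from the quantile and descriptive statistics of "Unnamed: 0".
minimum, maximum = 1, 187_251
q1, q3 = 33_974, 117_975
mean, std = 76_803.876, 50_412.731

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 8), f"{variance:.7e}")
```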
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3
 
< 0.1%
46542 3
 
< 0.1%
46548 3
 
< 0.1%
46547 3
 
< 0.1%
46546 3
 
< 0.1%
46545 3
 
< 0.1%
46544 3
 
< 0.1%
46543 3
 
< 0.1%
46541 3
 
< 0.1%
46635 3
 
< 0.1%
Other values (187241) 407654
> 99.9%
ValueCountFrequency (%)
1 3
< 0.1%
2 3
< 0.1%
3 3
< 0.1%
4 3
< 0.1%
5 3
< 0.1%
6 3
< 0.1%
7 3
< 0.1%
8 3
< 0.1%
9 3
< 0.1%
10 3
< 0.1%
ValueCountFrequency (%)
187251 1
< 0.1%
187250 1
< 0.1%
187249 1
< 0.1%
187248 1
< 0.1%
187247 1
< 0.1%
187246 1
< 0.1%
187245 1
< 0.1%
187244 1
< 0.1%
187243 1
< 0.1%
187242 1
< 0.1%

stop_id
Real number (ℝ)

Distinct353547
Distinct (%)86.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean269011.35
Minimum84362
Maximum449933
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum84362
5-th percentile106360.15
Q1177796.75
median269751.5
Q3359576.25
95-th percentile431524.85
Maximum449933
Range365571
Interquartile range (IQR)181779.5

Descriptive statistics

Standard deviation: 104491.5
Coefficient of variation (CV): 0.38842786
Kurtosis: -1.1966487
Mean: 269011.35
Median Absolute Deviation (MAD): 90905
Skewness: -0.006532643
Sum: 1.0967162 × 10^11
Variance: 1.0918474 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
174011 52
 
< 0.1%
184085 48
 
< 0.1%
180326 46
 
< 0.1%
169932 42
 
< 0.1%
183655 40
 
< 0.1%
161095 39
 
< 0.1%
174472 38
 
< 0.1%
236965 35
 
< 0.1%
170316 34
 
< 0.1%
169927 32
 
< 0.1%
Other values (353537) 407278
99.9%
ValueCountFrequency (%)
84362 1
< 0.1%
84364 1
< 0.1%
84365 1
< 0.1%
84366 1
< 0.1%
84369 1
< 0.1%
84370 1
< 0.1%
84371 1
< 0.1%
84372 2
< 0.1%
84373 1
< 0.1%
84374 1
< 0.1%
ValueCountFrequency (%)
449933 1
 
< 0.1%
449726 1
 
< 0.1%
449716 1
 
< 0.1%
449709 1
 
< 0.1%
449701 1
 
< 0.1%
449694 1
 
< 0.1%
449693 2
< 0.1%
449692 1
 
< 0.1%
449687 3
< 0.1%
449675 1
 
< 0.1%

pid
Real number (ℝ)

Distinct52
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.2621442
Minimum1
Maximum52
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile2
Maximum52
Range51
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.2245322
Coefficient of variation (CV)0.9701999
Kurtosis338.29907
Mean1.2621442
Median Absolute Deviation (MAD)0
Skewness14.708297
Sum514556
Variance1.4994791
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 353540
86.7%
2 35536
 
8.7%
3 9722
 
2.4%
4 3793
 
0.9%
5 1715
 
0.4%
6 889
 
0.2%
7 531
 
0.1%
8 360
 
0.1%
9 257
 
0.1%
10 210
 
0.1%
Other values (42) 1131
 
0.3%
ValueCountFrequency (%)
1 353540
86.7%
2 35536
 
8.7%
3 9722
 
2.4%
4 3793
 
0.9%
5 1715
 
0.4%
6 889
 
0.2%
7 531
 
0.1%
8 360
 
0.1%
9 257
 
0.1%
10 210
 
0.1%
ValueCountFrequency (%)
52 1
 
< 0.1%
51 1
 
< 0.1%
50 1
 
< 0.1%
49 1
 
< 0.1%
48 2
< 0.1%
47 2
< 0.1%
46 3
< 0.1%
45 3
< 0.1%
44 3
< 0.1%
43 3
< 0.1%

id
Categorical

HIGH CARDINALITY  UNIFORM  UNIQUE 

Distinct407684
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
84362_1
 
1
329041_1
 
1
329039_1
 
1
329038_1
 
1
329037_1
 
1
Other values (407679)
407679 

Length

Max length: 9
Median length: 8
Mean length: 7.9652083
Min length: 7

Characters and Unicode

Total characters: 3247288
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 407684
Unique (%): 100.0%

Sample

1st row84362_1
2nd row84364_1
3rd row84365_1
4th row84366_1
5th row84369_1
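
The sample values suggest that id is composed of a stop number and a per-stop person number joined by an underscore (for example 84362 is also the minimum of stop_id, and pid starts at 1). The report does not state this, so the sketch below treats it as an assumption; `split_id` is a hypothetical helper name.

```python
def split_id(record_id: str) -> tuple[int, int]:
    """Split an id such as '84362_1' into its two numeric parts.

    Assumption: the part before the underscore corresponds to stop_id
    and the part after it to pid. The report does not confirm this.
    """
    stop_part, person_part = record_id.split("_")
    return int(stop_part), int(person_part)


print(split_id("84362_1"))
print(split_id("329036_3"))
```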

Common Values

ValueCountFrequency (%)
84362_1 1
 
< 0.1%
329041_1 1
 
< 0.1%
329039_1 1
 
< 0.1%
329038_1 1
 
< 0.1%
329037_1 1
 
< 0.1%
329036_3 1
 
< 0.1%
329036_2 1
 
< 0.1%
329036_1 1
 
< 0.1%
329035_1 1
 
< 0.1%
329034_1 1
 
< 0.1%
Other values (407674) 407674
> 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
84362_1 1
 
< 0.1%
84365_1 1
 
< 0.1%
84369_1 1
 
< 0.1%
84370_1 1
 
< 0.1%
84371_1 1
 
< 0.1%
84372_1 1
 
< 0.1%
84372_2 1
 
< 0.1%
84373_1 1
 
< 0.1%
84374_1 1
 
< 0.1%
84375_1 1
 
< 0.1%
Other values (407674) 407674
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 674216
20.8%
_ 407684
12.6%
2 353904
10.9%
3 332496
10.2%
4 269023
 
8.3%
9 209563
 
6.5%
0 205642
 
6.3%
8 203747
 
6.3%
6 198566
 
6.1%
5 196463
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2839604
87.4%
Connector Punctuation 407684
 
12.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 674216
23.7%
2 353904
12.5%
3 332496
11.7%
4 269023
 
9.5%
9 209563
 
7.4%
0 205642
 
7.2%
8 203747
 
7.2%
6 198566
 
7.0%
5 196463
 
6.9%
7 195984
 
6.9%
Connector Punctuation
ValueCountFrequency (%)
_ 407684
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3247288
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 674216
20.8%
_ 407684
12.6%
2 353904
10.9%
3 332496
10.2%
4 269023
 
8.3%
9 209563
 
6.5%
0 205642
 
6.3%
8 203747
 
6.3%
6 198566
 
6.1%
5 196463
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3247288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 674216
20.8%
_ 407684
12.6%
2 353904
10.9%
3 332496
10.2%
4 269023
 
8.3%
9 209563
 
6.5%
0 205642
 
6.3%
8 203747
 
6.3%
6 198566
 
6.1%
5 196463
 
6.1%

ori
Categorical

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
CA0371100
407684 

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 3669156
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowCA0371100
2nd rowCA0371100
3rd rowCA0371100
4th rowCA0371100
5th rowCA0371100

Common Values

ValueCountFrequency (%)
CA0371100 407684
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ca0371100 407684
100.0%

Most occurring characters

ValueCountFrequency (%)
0 1223052
33.3%
1 815368
22.2%
C 407684
 
11.1%
A 407684
 
11.1%
3 407684
 
11.1%
7 407684
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2853788
77.8%
Uppercase Letter 815368
 
22.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1223052
42.9%
1 815368
28.6%
3 407684
 
14.3%
7 407684
 
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 407684
50.0%
A 407684
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2853788
77.8%
Latin 815368
 
22.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1223052
42.9%
1 815368
28.6%
3 407684
 
14.3%
7 407684
 
14.3%
Latin
ValueCountFrequency (%)
C 407684
50.0%
A 407684
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3669156
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1223052
33.3%
1 815368
22.2%
C 407684
 
11.1%
A 407684
 
11.1%
3 407684
 
11.1%
7 407684
 
11.1%

agency
Categorical

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
SD
407684 

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 815368
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowSD
2nd rowSD
3rd rowSD
4th rowSD
5th rowSD

Common Values

ValueCountFrequency (%)
SD 407684
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
sd 407684
100.0%

Most occurring characters

ValueCountFrequency (%)
S 407684
50.0%
D 407684
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 815368
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 407684
50.0%
D 407684
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 815368
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 407684
50.0%
D 407684
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 815368
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 407684
50.0%
D 407684
50.0%

exp_years
Real number (ℝ)

Distinct40
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.2757896
Minimum1
Maximum50
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median3
Q310
95-th percentile21
Maximum50
Range49
Interquartile range (IQR)9

Descriptive statistics

Standard deviation7.0895988
Coefficient of variation (CV)1.1296744
Kurtosis2.1497799
Mean6.2757896
Median Absolute Deviation (MAD)2
Skewness1.5884159
Sum2558539
Variance50.262411
MonotonicityNot monotonic
Histogram with fixed size bins (bins=40)
ValueCountFrequency (%)
1 152249
37.3%
3 33347
 
8.2%
2 30587
 
7.5%
5 30179
 
7.4%
4 24049
 
5.9%
10 16650
 
4.1%
11 12187
 
3.0%
18 11768
 
2.9%
9 10901
 
2.7%
12 9835
 
2.4%
Other values (30) 75932
18.6%
ValueCountFrequency (%)
1 152249
37.3%
2 30587
 
7.5%
3 33347
 
8.2%
4 24049
 
5.9%
5 30179
 
7.4%
6 9370
 
2.3%
7 4610
 
1.1%
8 5255
 
1.3%
9 10901
 
2.7%
10 16650
 
4.1%
ValueCountFrequency (%)
50 4
 
< 0.1%
49 23
 
< 0.1%
48 231
0.1%
45 33
 
< 0.1%
37 2
 
< 0.1%
35 1
 
< 0.1%
34 1
 
< 0.1%
33 35
 
< 0.1%
32 197
< 0.1%
31 88
 
< 0.1%

date
Categorical

Distinct912
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
2020-02-12
 
799
2019-05-23
 
793
2020-02-11
 
791
2019-07-06
 
755
2020-01-16
 
749
Other values (907)
403797 

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4076840
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row2019-01-01
2nd row2019-01-01
3rd row2019-01-01
4th row2019-01-01
5th row2019-01-01
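
The date column has 912 distinct values and the sample rows start at 2019-01-01. If that is the earliest date and no calendar day is skipped (neither is stated in the report), the last date can be estimated as follows. This is only an illustration of the arithmetic under those assumptions.

```python
from datetime import date, timedelta

# Assumptions: the first sample row is the earliest date, and every
# calendar day in the covered period appears at least once.
first_day = date(2019, 1, 1)
n_distinct_days = 912

# The first day counts as day 1, so the offset is one less.
last_day_if_contiguous = first_day + timedelta(days=n_distinct_days - 1)
print(last_day_if_contiguous.isoformat())
```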

Common Values

ValueCountFrequency (%)
2020-02-12 799
 
0.2%
2019-05-23 793
 
0.2%
2020-02-11 791
 
0.2%
2019-07-06 755
 
0.2%
2020-01-16 749
 
0.2%
2019-10-23 734
 
0.2%
2019-09-24 733
 
0.2%
2019-08-21 722
 
0.2%
2019-10-02 715
 
0.2%
2019-03-27 712
 
0.2%
Other values (902) 400181
98.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2020-02-12 799
 
0.2%
2019-05-23 793
 
0.2%
2020-02-11 791
 
0.2%
2019-07-06 755
 
0.2%
2020-01-16 749
 
0.2%
2019-10-23 734
 
0.2%
2019-09-24 733
 
0.2%
2019-08-21 722
 
0.2%
2019-10-02 715
 
0.2%
2019-03-27 712
 
0.2%
Other values (902) 400181
98.2%

Most occurring characters

ValueCountFrequency (%)
0 1073720
26.3%
2 865923
21.2%
- 815368
20.0%
1 591994
14.5%
9 254877
 
6.3%
3 102764
 
2.5%
5 80571
 
2.0%
4 79204
 
1.9%
6 74134
 
1.8%
7 69263
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3261472
80.0%
Dash Punctuation 815368
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1073720
32.9%
2 865923
26.6%
1 591994
18.2%
9 254877
 
7.8%
3 102764
 
3.2%
5 80571
 
2.5%
4 79204
 
2.4%
6 74134
 
2.3%
7 69263
 
2.1%
8 69022
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 815368
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4076840
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1073720
26.3%
2 865923
21.2%
- 815368
20.0%
1 591994
14.5%
9 254877
 
6.3%
3 102764
 
2.5%
5 80571
 
2.0%
4 79204
 
1.9%
6 74134
 
1.8%
7 69263
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4076840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1073720
26.3%
2 865923
21.2%
- 815368
20.0%
1 591994
14.5%
9 254877
 
6.3%
3 102764
 
2.5%
5 80571
 
2.0%
4 79204
 
1.9%
6 74134
 
1.8%
7 69263
 
1.7%

time
Categorical

Distinct77771
Distinct (%)19.1%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
16:00:00
 
1122
10:00:00
 
982
08:00:00
 
976
15:00:00
 
976
11:00:00
 
941
Other values (77766)
402687 

Length

Max length: 19
Median length: 8
Mean length: 8.0024823
Min length: 8

Characters and Unicode

Total characters: 3262484
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 14534
Unique (%): 3.6%

Sample

1st row00:15:07
2nd row00:15:16
3rd row00:02:00
4th row00:38:00
5th row01:06:41

Common Values

ValueCountFrequency (%)
16:00:00 1122
 
0.3%
10:00:00 982
 
0.2%
08:00:00 976
 
0.2%
15:00:00 976
 
0.2%
11:00:00 941
 
0.2%
09:00:00 936
 
0.2%
22:00:00 914
 
0.2%
17:00:00 900
 
0.2%
15:30:00 817
 
0.2%
07:00:00 800
 
0.2%
Other values (77761) 398320
97.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
16:00:00 1122
 
0.3%
10:00:00 982
 
0.2%
08:00:00 976
 
0.2%
15:00:00 976
 
0.2%
11:00:00 941
 
0.2%
09:00:00 936
 
0.2%
22:00:00 914
 
0.2%
17:00:00 900
 
0.2%
15:30:00 817
 
0.2%
07:00:00 800
 
0.2%
Other values (77754) 398412
97.7%

Most occurring characters

ValueCountFrequency (%)
: 815368
25.0%
0 690766
21.2%
1 418785
12.8%
2 291617
 
8.9%
5 228173
 
7.0%
3 221424
 
6.8%
4 196101
 
6.0%
8 104305
 
3.2%
7 100548
 
3.1%
9 100195
 
3.1%
Other values (3) 95202
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2446840
75.0%
Other Punctuation 815368
 
25.0%
Dash Punctuation 184
 
< 0.1%
Space Separator 92
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 690766
28.2%
1 418785
17.1%
2 291617
11.9%
5 228173
 
9.3%
3 221424
 
9.0%
4 196101
 
8.0%
8 104305
 
4.3%
7 100548
 
4.1%
9 100195
 
4.1%
6 94926
 
3.9%
Other Punctuation
ValueCountFrequency (%)
: 815368
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 184
100.0%
Space Separator
ValueCountFrequency (%)
(space) 92
100.0%
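
The character counts above hint at a formatting inconsistency in time: most values are 8 characters long (HH:MM:SS), but the maximum length is 19 and there are 92 spaces and 184 dashes. That pattern would fit 92 values stored as a full "YYYY-MM-DD HH:MM:SS" timestamp, which is an inference and not something the report states. The sketch below checks the inference against the reported total character count and shows one possible way to normalise such values; `normalize_time` is a hypothetical helper.

```python
from datetime import datetime

n_rows = 407_684
n_long = 92  # assumption: one space per 19-character value

# If the inference holds, this should equal the reported total (3262484).
total_chars = 8 * (n_rows - n_long) + 19 * n_long


def normalize_time(raw: str) -> str:
    """Return HH:MM:SS for either a bare time or a full timestamp."""
    raw = raw.strip()
    fmt = "%Y-%m-%d %H:%M:%S" if len(raw) == 19 else "%H:%M:%S"
    return datetime.strptime(raw, fmt).strftime("%H:%M:%S")


print(total_chars)
print(normalize_time("2019-01-01 16:00:00"))  # hypothetical long value
```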

Most occurring scripts

ValueCountFrequency (%)
Common 3262484
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
: 815368
25.0%
0 690766
21.2%
1 418785
12.8%
2 291617
 
8.9%
5 228173
 
7.0%
3 221424
 
6.8%
4 196101
 
6.0%
8 104305
 
3.2%
7 100548
 
3.1%
9 100195
 
3.1%
Other values (3) 95202
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3262484
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
: 815368
25.0%
0 690766
21.2%
1 418785
12.8%
2 291617
 
8.9%
5 228173
 
7.0%
3 221424
 
6.8%
4 196101
 
6.0%
8 104305
 
3.2%
7 100548
 
3.1%
9 100195
 
3.1%
Other values (3) 95202
 
2.9%

dur
Real number (ℝ)

Distinct337
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean28.579856
Minimum1
Maximum1440
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum1
5-th percentile5
Q110
median15
Q330
95-th percentile120
Maximum1440
Range1439
Interquartile range (IQR)20

Descriptive statistics

Standard deviation49.791228
Coefficient of variation (CV)1.7421791
Kurtosis182.30022
Mean28.579856
Median Absolute Deviation (MAD)6
Skewness9.2495682
Sum11651550
Variance2479.1664
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10 99603
24.4%
15 49427
12.1%
5 42258
10.4%
20 41432
10.2%
30 28548
 
7.0%
60 17833
 
4.4%
8 13319
 
3.3%
120 12718
 
3.1%
6 12384
 
3.0%
7 9335
 
2.3%
Other values (327) 80827
19.8%
ValueCountFrequency (%)
1 1035
 
0.3%
2 2654
 
0.7%
3 2755
 
0.7%
4 2327
 
0.6%
5 42258
10.4%
6 12384
 
3.0%
7 9335
 
2.3%
8 13319
 
3.3%
9 4340
 
1.1%
10 99603
24.4%
ValueCountFrequency (%)
1440 52
< 0.1%
1422 1
 
< 0.1%
1400 24
< 0.1%
1355 1
 
< 0.1%
1330 2
 
< 0.1%
1301 1
 
< 0.1%
1300 3
 
< 0.1%
1230 1
 
< 0.1%
1220 1
 
< 0.1%
1210 4
 
< 0.1%

is_serv
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
0
363639 
1
44045 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%
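
The alert for is_serv reports an imbalance of 50.6%. That figure is consistent with a score of one minus the normalised Shannon entropy of the class counts. The sketch below is a reimplementation under that assumption, not the library's own code, and `imbalance_score` is a name chosen here for illustration.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - (Shannon entropy / maximum entropy) for the given class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates. Assumed formula, chosen because it reproduces the
    figure reported for is_serv.
    """
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Counts of the values 0 and 1 from the table above.
score = imbalance_score([363_639, 44_045])
print(f"{100 * score:.1f}%")
```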

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%

Most occurring characters

ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 363639
89.2%
1 44045
 
10.8%

assign_key
Real number (ℝ)

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.4390827
Minimum1
Maximum10
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile5
Maximum10
Range9
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.8204938
Coefficient of variation (CV)1.2650377
Kurtosis16.054601
Mean1.4390827
Median Absolute Deviation (MAD)0
Skewness4.1949149
Sum586691
Variance3.3141978
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
1 378571
92.9%
10 13106
 
3.2%
2 7104
 
1.7%
9 3700
 
0.9%
5 1624
 
0.4%
7 1247
 
0.3%
6 802
 
0.2%
4 626
 
0.2%
8 535
 
0.1%
3 369
 
0.1%
ValueCountFrequency (%)
1 378571
92.9%
2 7104
 
1.7%
3 369
 
0.1%
4 626
 
0.2%
5 1624
 
0.4%
6 802
 
0.2%
7 1247
 
0.3%
8 535
 
0.1%
9 3700
 
0.9%
10 13106
 
3.2%
ValueCountFrequency (%)
10 13106
 
3.2%
9 3700
 
0.9%
8 535
 
0.1%
7 1247
 
0.3%
6 802
 
0.2%
5 1624
 
0.4%
4 626
 
0.2%
3 369
 
0.1%
2 7104
 
1.7%
1 378571
92.9%

assign_words
Categorical

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size6.2 MiB
Patrol, traffic enforcement, field operations
378571 
Other
 
13106
Gang enforcement
 
7104
Investigative/detective
 
3700
Roadblock or DUI sobriety checkpoint
 
1624
Other values (5)
 
3579

Length

Max length: 78
Median length: 45
Mean length: 42.774671
Min length: 5

Characters and Unicode

Total characters: 17438549
Distinct characters: 39
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowPatrol, traffic enforcement, field operations
2nd rowPatrol, traffic enforcement, field operations
3rd rowPatrol, traffic enforcement, field operations
4th rowPatrol, traffic enforcement, field operations
5th rowPatrol, traffic enforcement, field operations

Common Values

ValueCountFrequency (%)
Patrol, traffic enforcement, field operations 378571
92.9%
Other 13106
 
3.2%
Gang enforcement 7104
 
1.7%
Investigative/detective 3700
 
0.9%
Roadblock or DUI sobriety checkpoint 1624
 
0.4%
Task force 1247
 
0.3%
Narcotics/vice 802
 
0.2%
Special events 626
 
0.2%
K1-12 public school inlcuding school resource officer or school police officer 535
 
0.1%
Compliance check 369
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
enforcement 385675
19.8%
patrol 378571
19.5%
field 378571
19.5%
operations 378571
19.5%
traffic 378571
19.5%
other 13106
 
0.7%
gang 7104
 
0.4%
investigative/detective 3700
 
0.2%
or 2159
 
0.1%
roadblock 1624
 
0.1%
Other values (17) 15508
 
0.8%

Most occurring characters

ValueCountFrequency (%)
e 1956361
11.2%
t 1553970
8.9%
r 1542466
8.8%
o 1537811
8.8%
(space) 1535476
8.8%
f 1524775
8.7%
n 1164414
 
6.7%
i 1155870
 
6.6%
a 1151185
 
6.6%
c 783019
 
4.5%
Other values (29) 3533202
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14726733
84.4%
Space Separator 1535476
 
8.8%
Other Punctuation 761644
 
4.4%
Uppercase Letter 412556
 
2.4%
Decimal Number 1605
 
< 0.1%
Dash Punctuation 535
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1956361
13.3%
t 1553970
10.6%
r 1542466
10.5%
o 1537811
10.4%
f 1524775
10.4%
n 1164414
7.9%
i 1155870
7.8%
a 1151185
7.8%
c 783019
5.3%
l 762971
 
5.2%
Other values (11) 1593891
10.8%
Uppercase Letter
ValueCountFrequency (%)
P 378571
91.8%
O 13106
 
3.2%
G 7104
 
1.7%
I 5324
 
1.3%
R 1624
 
0.4%
D 1624
 
0.4%
U 1624
 
0.4%
T 1247
 
0.3%
N 802
 
0.2%
S 626
 
0.2%
Other values (2) 904
 
0.2%
Other Punctuation
ValueCountFrequency (%)
, 757142
99.4%
/ 4502
 
0.6%
Decimal Number
ValueCountFrequency (%)
1 1070
66.7%
2 535
33.3%
Space Separator
ValueCountFrequency (%)
(space) 1535476
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 535
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15139289
86.8%
Common 2299260
 
13.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1956361
12.9%
t 1553970
10.3%
r 1542466
10.2%
o 1537811
10.2%
f 1524775
10.1%
n 1164414
7.7%
i 1155870
7.6%
a 1151185
7.6%
c 783019
 
5.2%
l 762971
 
5.0%
Other values (23) 2006447
13.3%
Common
ValueCountFrequency (%)
(space) 1535476
66.8%
, 757142
32.9%
/ 4502
 
0.2%
1 1070
 
< 0.1%
- 535
 
< 0.1%
2 535
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17438549
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1956361
11.2%
t 1553970
8.9%
r 1542466
8.8%
o 1537811
8.8%
(space) 1535476
8.8%
f 1524775
8.7%
n 1164414
 
6.7%
i 1155870
 
6.6%
a 1151185
 
6.6%
c 783019
 
4.5%
Other values (29) 3533202
20.3%

inters
Categorical

HIGH CARDINALITY  MISSING 

Distinct15939
Distinct (%)39.1%
Missing366868
Missing (%)90.0%
Memory size6.2 MiB
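
Note that Distinct (%) for this variable appears to be computed over the non-missing rows only, not over all observations. The sketch below checks that reading against the figures for inters and for ldmk; the function name is illustrative and the interpretation is inferred from the numbers, not stated by the report.

```python
def distinct_pct(distinct: int, n_rows: int, missing: int) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    return 100 * distinct / (n_rows - missing)


n_rows = 407_684

# inters: 15939 distinct values, 366868 missing.
inters_pct = distinct_pct(15_939, n_rows, 366_868)
# ldmk: 36 distinct values, 407643 missing.
ldmk_pct = distinct_pct(36, n_rows, 407_643)

print(f"inters: {inters_pct:.1f}%  ldmk: {ldmk_pct:.1f}%")
```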
BROADWAY
 
278
MIRAMAR WAY
 
250
CAMINO DE LA PLAZA/ CAMIONES WAY
 
222
OTAY VALLEY ROAD/ AVENIDA DE LAS VISTAS
 
145
G Street
 
137
Other values (15934)
39784 

Length

Max length: 77
Median length: 59
Mean length: 13.920178
Min length: 1

Characters and Unicode

Total characters: 568166
Distinct characters: 78
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 11202
Unique (%): 27.4%

Sample

1st rowgovernor dr
2nd rowla jolla village dr
3rd rowmission/hornblend
4th rowhornblend/mission blvd
5th rowclairemont mesa blvd

Common Values

ValueCountFrequency (%)
BROADWAY 278
 
0.1%
MIRAMAR WAY 250
 
0.1%
CAMINO DE LA PLAZA/ CAMIONES WAY 222
 
0.1%
OTAY VALLEY ROAD/ AVENIDA DE LAS VISTAS 145
 
< 0.1%
G Street 137
 
< 0.1%
imperial 129
 
< 0.1%
garnet 128
 
< 0.1%
w ash 127
 
< 0.1%
MARKET ST 108
 
< 0.1%
I-15 105
 
< 0.1%
Other values (15929) 39187
 
9.6%
(Missing) 366868
90.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 3985
 
3.7%
st 3619
 
3.4%
ave 3372
 
3.2%
3273
 
3.1%
street 2699
 
2.5%
beach 1622
 
1.5%
rd 1616
 
1.5%
mission 1598
 
1.5%
blvd 1541
 
1.4%
dr 1353
 
1.3%
Other values (4709) 81672
76.8%

Most occurring characters

ValueCountFrequency (%)
(space) 65653
 
11.6%
a 30420
 
5.4%
e 28576
 
5.0%
A 27394
 
4.8%
r 22026
 
3.9%
E 20689
 
3.6%
n 19236
 
3.4%
t 18707
 
3.3%
R 17141
 
3.0%
o 16782
 
3.0%
Other values (68) 301542
53.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 253618
44.6%
Uppercase Letter 218264
38.4%
Space Separator 65653
 
11.6%
Decimal Number 19086
 
3.4%
Other Punctuation 9867
 
1.7%
Dash Punctuation 1667
 
0.3%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 30420
12.0%
e 28576
11.3%
r 22026
 
8.7%
n 19236
 
7.6%
t 18707
 
7.4%
o 16782
 
6.6%
i 16117
 
6.4%
s 14379
 
5.7%
l 13637
 
5.4%
d 13130
 
5.2%
Other values (16) 60608
23.9%
Uppercase Letter
ValueCountFrequency (%)
A 27394
12.6%
E 20689
 
9.5%
R 17141
 
7.9%
S 15712
 
7.2%
I 15083
 
6.9%
N 13468
 
6.2%
O 12558
 
5.8%
T 11194
 
5.1%
L 10226
 
4.7%
C 9970
 
4.6%
Other values (16) 64829
29.7%
Other Punctuation
ValueCountFrequency (%)
/ 8725
88.4%
. 577
 
5.8%
& 194
 
2.0%
@ 181
 
1.8%
, 116
 
1.2%
' 59
 
0.6%
! 5
 
0.1%
: 5
 
0.1%
# 3
 
< 0.1%
% 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 5996
31.4%
1 3466
18.2%
8 2088
 
10.9%
0 1848
 
9.7%
6 1280
 
6.7%
4 1200
 
6.3%
3 1120
 
5.9%
2 971
 
5.1%
9 627
 
3.3%
7 490
 
2.6%
Space Separator
ValueCountFrequency (%)
(space) 65653
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1667
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 471882
83.1%
Common 96284
 
16.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 30420
 
6.4%
e 28576
 
6.1%
A 27394
 
5.8%
r 22026
 
4.7%
E 20689
 
4.4%
n 19236
 
4.1%
t 18707
 
4.0%
R 17141
 
3.6%
o 16782
 
3.6%
i 16117
 
3.4%
Other values (42) 254794
54.0%
Common
ValueCountFrequency (%)
(space) 65653
68.2%
/ 8725
 
9.1%
5 5996
 
6.2%
1 3466
 
3.6%
8 2088
 
2.2%
0 1848
 
1.9%
- 1667
 
1.7%
6 1280
 
1.3%
4 1200
 
1.2%
3 1120
 
1.2%
Other values (16) 3241
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 568166
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 65653
 
11.6%
a 30420
 
5.4%
e 28576
 
5.0%
A 27394
 
4.8%
r 22026
 
3.9%
E 20689
 
3.6%
n 19236
 
3.4%
t 18707
 
3.3%
R 17141
 
3.0%
o 16782
 
3.0%
Other values (68) 301542
53.1%

block
Real number (ℝ)

MISSING  SKEWED 

Distinct: 307
Distinct (%): 0.1%
Missing: 43330
Missing (%): 10.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 7028.819
Minimum: 0
Maximum: 99999900
Zeros: 133
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 200
Q1: 1300
Median: 3200
Q3: 4800
95-th percentile: 9600
Maximum: 99999900
Range: 99999900
Interquartile range (IQR): 3500

Descriptive statistics

Standard deviation: 321105.19
Coefficient of variation (CV): 45.684089
Kurtosis: 77631.695
Mean: 7028.819
Median Absolute Deviation (MAD): 1800
Skewness: 254.9437
Sum: 2.5609783 × 10^9
Variance: 1.0310854 × 10^11
Monotonicity: Not monotonic
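
The derived statistics for `block` can be cross-checked from the other figures in the report. The sketch below is illustrative only: it copies the reported values by hand and assumes the mean was computed over non-missing observations, which the report does not state explicitly.

```python
# Values copied from the report's statistics for the `block` variable.
mean = 7028.819
std_dev = 321105.19
q1, q3 = 1300, 4800
minimum, maximum = 0, 99999900

# Assumption: the mean excludes missing cells (observations minus missing).
non_missing = 407684 - 43330

cv = std_dev / mean             # coefficient of variation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
variance = std_dev ** 2
total = mean * non_missing      # approximate, since the mean is rounded

print(f"CV={cv:.4f} IQR={iqr} range={value_range}")
print(f"variance={variance:.4e} sum={total:.4e}")
```

A coefficient of variation this far above 1, next to a median of 3200, suggests that a small number of very large values dominate the mean and standard deviation.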
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
100 11707
 
2.9%
700 9976
 
2.4%
3000 8781
 
2.2%
4000 8565
 
2.1%
1000 8233
 
2.0%
500 8060
 
2.0%
800 7831
 
1.9%
4200 7434
 
1.8%
4300 7320
 
1.8%
3800 7149
 
1.8%
Other values (297) 279298
68.5%
(Missing) 43330
 
10.6%
ValueCountFrequency (%)
0 133
 
< 0.1%
100 11707
2.9%
200 7056
1.7%
300 6640
1.6%
400 5851
1.4%
500 8060
2.0%
600 6673
1.6%
700 9976
2.4%
800 7831
1.9%
900 6361
1.6%
ValueCountFrequency (%)
99999900 3
 
< 0.1%
18007300 1
 
< 0.1%
9999900 70
 
< 0.1%
5600900 1
 
< 0.1%
999900 221
0.1%
520000 1
 
< 0.1%
180000 1
 
< 0.1%
154000 1
 
< 0.1%
147000 1
 
< 0.1%
140000 1
 
< 0.1%

ldmk
Categorical

MISSING  UNIFORM 

Distinct: 36
Distinct (%): 87.8%
Missing: 407643
Missing (%): > 99.9%
Memory size: 6.2 MiB
15nb exit
North Cove Park Pacific beach
 
2
i805/43rd St
 
2
BALBOA PARK - SPANISH VILLAGE
 
1
NB I-15 AT AERO DRIVE
 
1
Other values (31)
31 

Length

Max length: 41
Median length: 29
Mean length: 19.390244
Min length: 8

Characters and Unicode

Total characters: 795
Distinct characters: 58
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
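
As a rough illustration of how such per-category character counts can be obtained, the sketch below uses Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example, and this is not necessarily how the report itself was generated.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen for `ldmk`.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Three of the sample rows listed for `ldmk`.
sample = ["sr905 / i805", "I15 / I8", "ON TROLLEY IN SANTEE"]
for name, n in category_counts(sample).most_common():
    print(name, n)
```

Applied to a whole column (skipping missing values), the same loop would yield a breakdown like the "Most occurring categories" table.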

Unique

Unique: 33
Unique (%): 80.5%

Sample

1st row: sr905 / i805
2nd row: North Cove Park Pacific beach
3rd row: North Cove Park Pacific beach
4th row: I15 / I8
5th row: ON TROLLEY IN SANTEE

Common Values

ValueCountFrequency (%)
15nb exit 4
 
< 0.1%
North Cove Park Pacific beach 2
 
< 0.1%
i805/43rd St 2
 
< 0.1%
BALBOA PARK - SPANISH VILLAGE 1
 
< 0.1%
NB I-15 AT AERO DRIVE 1
 
< 0.1%
NORTHBOUND INTERSTATE-15/AERO DRIVE 1
 
< 0.1%
North Cove Public Beach 1
 
< 0.1%
Convention Center 1
 
< 0.1%
de anza cove 1
 
< 0.1%
NB I-15 @ BALBOA AVE 1
 
< 0.1%
Other values (26) 26
 
< 0.1%
(Missing) 407643
> 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
at 9
 
5.9%
8
 
5.3%
sb 6
 
3.9%
park 6
 
3.9%
balboa 5
 
3.3%
and 5
 
3.3%
i-15 5
 
3.3%
15nb 4
 
2.6%
nb 4
 
2.6%
exit 4
 
2.6%
Other values (66) 96
63.2%

Most occurring characters

ValueCountFrequency (%)
111
 
14.0%
A 42
 
5.3%
E 40
 
5.0%
T 32
 
4.0%
R 30
 
3.8%
B 29
 
3.6%
a 29
 
3.6%
I 27
 
3.4%
S 26
 
3.3%
N 25
 
3.1%
Other values (48) 404
50.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 361
45.4%
Lowercase Letter 218
27.4%
Space Separator 111
 
14.0%
Decimal Number 80
 
10.1%
Dash Punctuation 13
 
1.6%
Other Punctuation 12
 
1.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 42
11.6%
E 40
11.1%
T 32
8.9%
R 30
8.3%
B 29
 
8.0%
I 27
 
7.5%
S 26
 
7.2%
N 25
 
6.9%
O 22
 
6.1%
D 14
 
3.9%
Other values (12) 74
20.5%
Lowercase Letter
ValueCountFrequency (%)
a 29
13.3%
e 22
10.1%
n 18
 
8.3%
t 18
 
8.3%
o 17
 
7.8%
r 17
 
7.8%
i 14
 
6.4%
b 12
 
5.5%
c 11
 
5.0%
d 9
 
4.1%
Other values (12) 51
23.4%
Decimal Number
ValueCountFrequency (%)
5 25
31.2%
1 16
20.0%
0 8
 
10.0%
8 7
 
8.8%
4 6
 
7.5%
9 6
 
7.5%
3 5
 
6.2%
6 4
 
5.0%
2 2
 
2.5%
7 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
/ 7
58.3%
@ 5
41.7%
Space Separator
ValueCountFrequency (%)
111
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 579
72.8%
Common 216
 
27.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 42
 
7.3%
E 40
 
6.9%
T 32
 
5.5%
R 30
 
5.2%
B 29
 
5.0%
a 29
 
5.0%
I 27
 
4.7%
S 26
 
4.5%
N 25
 
4.3%
O 22
 
3.8%
Other values (34) 277
47.8%
Common
ValueCountFrequency (%)
111
51.4%
5 25
 
11.6%
1 16
 
7.4%
- 13
 
6.0%
0 8
 
3.7%
/ 7
 
3.2%
8 7
 
3.2%
4 6
 
2.8%
9 6
 
2.8%
3 5
 
2.3%
Other values (4) 12
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
111
 
14.0%
A 42
 
5.3%
E 40
 
5.0%
T 32
 
4.0%
R 30
 
3.8%
B 29
 
3.6%
a 29
 
3.6%
I 27
 
3.4%
S 26
 
3.3%
N 25
 
3.1%
Other values (48) 404
50.8%

street
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 44668
Distinct (%): 11.4%
Missing: 16834
Missing (%): 4.1%
Memory size: 6.2 MiB
El Cajon Blvd
 
2488
el cajon blvd
 
1577
imperial ave
 
1551
imperial
 
1469
garnet
 
1367
Other values (44663)
382398 

Length

Max length: 43
Median length: 36
Mean length: 10.67244
Min length: 1

Characters and Unicode

Total characters: 4171323
Distinct characters: 82
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 24136
Unique (%): 6.2%

Sample

1st row: UNIVERSITY
2nd row: hillside dr
3rd row: ocean blvd
4th row: garnet
5th row: coronado

Common Values

ValueCountFrequency (%)
El Cajon Blvd 2488
 
0.6%
el cajon blvd 1577
 
0.4%
imperial ave 1551
 
0.4%
imperial 1469
 
0.4%
garnet 1367
 
0.3%
university ave 1270
 
0.3%
University Ave 1240
 
0.3%
university 1221
 
0.3%
EL CAJON BLVD 1123
 
0.3%
commercial 1047
 
0.3%
Other values (44658) 376497
92.4%
(Missing) 16834
 
4.1%
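
Several of the common values above differ only in letter case (for example `El Cajon Blvd`, `el cajon blvd`, and `EL CAJON BLVD`), which inflates the distinct count for `street`. The sketch below shows one possible way to merge such variants. The `normalize` helper is hypothetical, and the counts are copied by hand from the table rather than read from the dataset.

```python
from collections import Counter

# Counts copied from the Common Values table for `street`.
common_values = {
    "El Cajon Blvd": 2488,
    "el cajon blvd": 1577,
    "imperial ave": 1551,
    "imperial": 1469,
    "garnet": 1367,
    "university ave": 1270,
    "University Ave": 1240,
    "university": 1221,
    "EL CAJON BLVD": 1123,
    "commercial": 1047,
}

def normalize(name):
    """Hypothetical normalizer: trim, collapse inner whitespace, lowercase."""
    return " ".join(name.split()).lower()

# Sum the counts of all spellings that normalize to the same key.
merged = Counter()
for name, count in common_values.items():
    merged[normalize(name)] += count

for name, count in merged.most_common():
    print(name, count)
```

Case folding alone does not reconcile `imperial` with `imperial ave` or `university` with `university ave`; handling street-type suffixes would need additional rules that are not sketched here.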

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ave 47927
 
6.1%
st 39901
 
5.1%
street 37969
 
4.9%
blvd 26679
 
3.4%
avenue 17957
 
2.3%
rd 15152
 
1.9%
dr 12807
 
1.6%
mission 11774
 
1.5%
road 10459
 
1.3%
el 9610
 
1.2%
Other values (10930) 549297
70.5%

Most occurring characters

ValueCountFrequency (%)
389073
 
9.3%
e 299210
 
7.2%
a 257248
 
6.2%
t 210603
 
5.0%
r 205794
 
4.9%
n 163322
 
3.9%
A 155793
 
3.7%
o 151343
 
3.6%
i 143971
 
3.5%
l 135009
 
3.2%
Other values (72) 2059957
49.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2348150
56.3%
Uppercase Letter 1323188
31.7%
Space Separator 389073
 
9.3%
Decimal Number 99771
 
2.4%
Other Punctuation 9179
 
0.2%
Dash Punctuation 1695
 
< 0.1%
Open Punctuation 110
 
< 0.1%
Close Punctuation 108
 
< 0.1%
Modifier Symbol 46
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 299210
12.7%
a 257248
11.0%
t 210603
 
9.0%
r 205794
 
8.8%
n 163322
 
7.0%
o 151343
 
6.4%
i 143971
 
6.1%
l 135009
 
5.7%
s 132042
 
5.6%
v 100060
 
4.3%
Other values (16) 549548
23.4%
Uppercase Letter
ValueCountFrequency (%)
A 155793
11.8%
E 128833
 
9.7%
R 111686
 
8.4%
S 109975
 
8.3%
T 83749
 
6.3%
N 75814
 
5.7%
I 72833
 
5.5%
O 69409
 
5.2%
L 68521
 
5.2%
D 62917
 
4.8%
Other values (16) 383658
29.0%
Other Punctuation
ValueCountFrequency (%)
. 7202
78.5%
/ 1209
 
13.2%
& 330
 
3.6%
# 182
 
2.0%
@ 95
 
1.0%
' 70
 
0.8%
, 54
 
0.6%
: 22
 
0.2%
; 11
 
0.1%
\ 2
 
< 0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 19805
19.9%
5 15134
15.2%
4 14014
14.0%
3 10710
10.7%
0 9419
9.4%
6 9332
9.4%
7 6912
 
6.9%
2 6662
 
6.7%
8 4663
 
4.7%
9 3120
 
3.1%
Open Punctuation
ValueCountFrequency (%)
( 108
98.2%
[ 2
 
1.8%
Space Separator
ValueCountFrequency (%)
389073
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1695
100.0%
Close Punctuation
ValueCountFrequency (%)
) 108
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 46
100.0%
Math Symbol
ValueCountFrequency (%)
= 2
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3671338
88.0%
Common 499985
 
12.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 299210
 
8.1%
a 257248
 
7.0%
t 210603
 
5.7%
r 205794
 
5.6%
n 163322
 
4.4%
A 155793
 
4.2%
o 151343
 
4.1%
i 143971
 
3.9%
l 135009
 
3.7%
s 132042
 
3.6%
Other values (42) 1817003
49.5%
Common
ValueCountFrequency (%)
389073
77.8%
1 19805
 
4.0%
5 15134
 
3.0%
4 14014
 
2.8%
3 10710
 
2.1%
0 9419
 
1.9%
6 9332
 
1.9%
. 7202
 
1.4%
7 6912
 
1.4%
2 6662
 
1.3%
Other values (20) 11722
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4171323
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
389073
 
9.3%
e 299210
 
7.2%
a 257248
 
6.2%
t 210603
 
5.0%
r 205794
 
4.9%
n 163322
 
3.9%
A 155793
 
3.7%
o 151343
 
3.6%
i 143971
 
3.5%
l 135009
 
3.2%
Other values (72) 2059957
49.4%

hw_exit
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 2211
Distinct (%): 72.1%
Missing: 404618
Missing (%): 99.2%
Memory size: 6.2 MiB
NB I-15
 
36
I-805/PLAZA BOULEVARD
 
31
I-805/SR-54
 
29
SR 905
 
28
SB I-15
 
26
Other values (2206)
2916 

Length

Max length: 60
Median length: 44
Mean length: 18.403783
Min length: 2

Characters and Unicode

Total characters: 56426
Distinct characters: 74
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1907
Unique (%): 62.2%

Sample

1st row: n/b 5 @ sea world
2nd row: I15NB @ AERO DR
3rd row: wB 8 @ WARING
4th row: 15 AT MIRAMAR
5th row: 15 AT 163

Common Values

ValueCountFrequency (%)
NB I-15 36
 
< 0.1%
I-805/PLAZA BOULEVARD 31
 
< 0.1%
I-805/SR-54 29
 
< 0.1%
SR 905 28
 
< 0.1%
SB I-15 26
 
< 0.1%
I-805/43RD STREET 23
 
< 0.1%
I-805/H STREET 19
 
< 0.1%
NB 805 AT SR-163 18
 
< 0.1%
NB 805 AT MURRAY RIDGE ROAD 14
 
< 0.1%
I-5/VIA DE SAN YSIDRO 14
 
< 0.1%
Other values (2201) 2828
 
0.7%
(Missing) 404618
99.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
at 960
 
8.2%
15 588
 
5.0%
sb 515
 
4.4%
500
 
4.3%
nb 440
 
3.8%
805 300
 
2.6%
i-15 266
 
2.3%
street 229
 
2.0%
and 206
 
1.8%
road 185
 
1.6%
Other values (803) 7487
64.1%

Most occurring characters

ValueCountFrequency (%)
8622
 
15.3%
5 2773
 
4.9%
A 2373
 
4.2%
E 2281
 
4.0%
T 2227
 
3.9%
a 2185
 
3.9%
R 2070
 
3.7%
I 1803
 
3.2%
S 1730
 
3.1%
N 1479
 
2.6%
Other values (64) 28883
51.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 21601
38.3%
Lowercase Letter 15708
27.8%
Space Separator 8622
 
15.3%
Decimal Number 7684
 
13.6%
Other Punctuation 1495
 
2.6%
Dash Punctuation 1304
 
2.3%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%
Math Symbol 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2373
11.0%
E 2281
10.6%
T 2227
10.3%
R 2070
9.6%
I 1803
8.3%
S 1730
8.0%
N 1479
 
6.8%
O 1256
 
5.8%
B 1235
 
5.7%
D 823
 
3.8%
Other values (16) 4324
20.0%
Lowercase Letter
ValueCountFrequency (%)
a 2185
13.9%
r 1442
 
9.2%
t 1414
 
9.0%
e 1414
 
9.0%
n 1144
 
7.3%
s 1065
 
6.8%
o 1004
 
6.4%
b 963
 
6.1%
i 759
 
4.8%
l 658
 
4.2%
Other values (16) 3660
23.3%
Decimal Number
ValueCountFrequency (%)
5 2773
36.1%
1 1436
18.7%
8 1091
 
14.2%
0 834
 
10.9%
6 401
 
5.2%
4 298
 
3.9%
3 287
 
3.7%
9 277
 
3.6%
2 186
 
2.4%
7 101
 
1.3%
Other Punctuation
ValueCountFrequency (%)
/ 1091
73.0%
@ 344
 
23.0%
, 29
 
1.9%
. 16
 
1.1%
& 14
 
0.9%
! 1
 
0.1%
Space Separator
ValueCountFrequency (%)
8622
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1304
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37309
66.1%
Common 19117
33.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2373
 
6.4%
E 2281
 
6.1%
T 2227
 
6.0%
a 2185
 
5.9%
R 2070
 
5.5%
I 1803
 
4.8%
S 1730
 
4.6%
N 1479
 
4.0%
r 1442
 
3.9%
t 1414
 
3.8%
Other values (42) 18305
49.1%
Common
ValueCountFrequency (%)
8622
45.1%
5 2773
 
14.5%
1 1436
 
7.5%
- 1304
 
6.8%
8 1091
 
5.7%
/ 1091
 
5.7%
0 834
 
4.4%
6 401
 
2.1%
@ 344
 
1.8%
4 298
 
1.6%
Other values (12) 923
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56426
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8622
 
15.3%
5 2773
 
4.9%
A 2373
 
4.2%
E 2281
 
4.0%
T 2227
 
3.9%
a 2185
 
3.9%
R 2070
 
3.7%
I 1803
 
3.2%
S 1730
 
3.1%
N 1479
 
2.6%
Other values (64) 28883
51.2%

is_school
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
0
407362 
1
 
322

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 407362
99.9%
1 322
 
0.1%

school_name
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 85
Distinct (%): 26.4%
Missing: 407362
Missing (%): 99.9%
Memory size: 6.2 MiB
Ibarra Elementary (San Diego Unified) 37683380108290
27 
Rancho Bernardo High (Poway Unified) 37682963730819
 
16
Del Norte High (Poway Unified) 37682960118935
 
15
Serra High (San Diego Unified) 37683383730173
 
13
The O'Farrell Charter (San Diego Unified) 37683386061964
 
13
Other values (80)
238 

Length

Max length69
Median length66
Mean length52.751553
Min length34

Characters and Unicode

Total characters16986
Distinct characters65
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)9.9%

Sample

1st rowGarfield Elementary (San Diego Unified) 37683386039655
2nd rowGarfield Elementary (San Diego Unified) 37683386039655
3rd rowGarfield Elementary (San Diego Unified) 37683386039655
4th rowGarfield Elementary (San Diego Unified) 37683386039655
5th rowGrant K-8 (San Diego Unified) 37683386039671

Common Values

ValueCountFrequency (%)
Ibarra Elementary (San Diego Unified) 37683380108290 27
 
< 0.1%
Rancho Bernardo High (Poway Unified) 37682963730819 16
 
< 0.1%
Del Norte High (Poway Unified) 37682960118935 15
 
< 0.1%
Serra High (San Diego Unified) 37683383730173 13
 
< 0.1%
The O'Farrell Charter (San Diego Unified) 37683386061964 13
 
< 0.1%
De Portola Middle (San Diego Unified) 37683386106181 13
 
< 0.1%
Montgomery Senior High (Sweetwater Union High) 37684113738234 13
 
< 0.1%
Torrey Pines High (San Dieguito Union High) 37683463730033 11
 
< 0.1%
Sunset Elementary (San Ysidro Elementary) 37683796093264 11
 
< 0.1%
San Ysidro High (Sweetwater Union High) 37684113731502 9
 
< 0.1%
Other values (75) 181
 
< 0.1%
(Missing) 407362
99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
san 222
 
10.8%
unified 221
 
10.7%
high 165
 
8.0%
diego 149
 
7.2%
elementary 144
 
7.0%
poway 72
 
3.5%
union 62
 
3.0%
middle 50
 
2.4%
ysidro 48
 
2.3%
sweetwater 31
 
1.5%
Other values (200) 896
43.5%

Most occurring characters

ValueCountFrequency (%)
1738
 
10.2%
e 1193
 
7.0%
i 1088
 
6.4%
3 1054
 
6.2%
n 915
 
5.4%
a 813
 
4.8%
6 688
 
4.1%
8 636
 
3.7%
r 616
 
3.6%
o 589
 
3.5%
Other values (55) 7656
45.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8285
48.8%
Decimal Number 4515
26.6%
Uppercase Letter 1774
 
10.4%
Space Separator 1738
 
10.2%
Close Punctuation 322
 
1.9%
Open Punctuation 322
 
1.9%
Other Punctuation 23
 
0.1%
Dash Punctuation 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1193
14.4%
i 1088
13.1%
n 915
11.0%
a 813
9.8%
r 616
7.4%
o 589
 
7.1%
d 431
 
5.2%
t 402
 
4.9%
g 375
 
4.5%
l 345
 
4.2%
Other values (14) 1518
18.3%
Uppercase Letter
ValueCountFrequency (%)
S 331
18.7%
U 285
16.1%
D 214
12.1%
H 179
10.1%
E 149
8.4%
P 113
 
6.4%
M 102
 
5.7%
C 90
 
5.1%
Y 48
 
2.7%
B 46
 
2.6%
Other values (14) 217
12.2%
Decimal Number
ValueCountFrequency (%)
3 1054
23.3%
6 688
15.2%
8 636
14.1%
7 542
12.0%
0 467
10.3%
1 377
 
8.3%
9 280
 
6.2%
2 206
 
4.6%
4 147
 
3.3%
5 118
 
2.6%
Other Punctuation
ValueCountFrequency (%)
' 13
56.5%
. 9
39.1%
/ 1
 
4.3%
Space Separator
ValueCountFrequency (%)
1738
100.0%
Close Punctuation
ValueCountFrequency (%)
) 322
100.0%
Open Punctuation
ValueCountFrequency (%)
( 322
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10059
59.2%
Common 6927
40.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1193
 
11.9%
i 1088
 
10.8%
n 915
 
9.1%
a 813
 
8.1%
r 616
 
6.1%
o 589
 
5.9%
d 431
 
4.3%
t 402
 
4.0%
g 375
 
3.7%
l 345
 
3.4%
Other values (38) 3292
32.7%
Common
ValueCountFrequency (%)
1738
25.1%
3 1054
15.2%
6 688
 
9.9%
8 636
 
9.2%
7 542
 
7.8%
0 467
 
6.7%
1 377
 
5.4%
) 322
 
4.6%
( 322
 
4.6%
9 280
 
4.0%
Other values (7) 501
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16986
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1738
 
10.2%
e 1193
 
7.0%
i 1088
 
6.4%
3 1054
 
6.2%
n 915
 
5.4%
a 813
 
4.8%
6 688
 
4.1%
8 636
 
3.7%
r 616
 
3.6%
o 589
 
3.5%
Other values (55) 7656
45.1%

city
Categorical

Distinct: 46
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
SAN DIEGO
400853 
SAN YSIDRO
 
2072
CHULA VISTA
 
1033
NATIONAL CITY
 
783
EL CAJON
 
415
Other values (41)
 
2528

Length

Max length: 36
Median length: 9
Mean length: 9.0178766
Min length: 4

Characters and Unicode

Total characters: 3676444
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: SAN DIEGO
2nd row: LA JOLLA
3rd row: SAN DIEGO
4th row: SAN DIEGO
5th row: SAN DIEGO

Common Values

ValueCountFrequency (%)
SAN DIEGO 400853
98.3%
SAN YSIDRO 2072
 
0.5%
CHULA VISTA 1033
 
0.3%
NATIONAL CITY 783
 
0.2%
EL CAJON 415
 
0.1%
ESCONDIDO 398
 
0.1%
LA MESA 310
 
0.1%
LA JOLLA 287
 
0.1%
LEMON GROVE 282
 
0.1%
SANTEE 209
 
0.1%
Other values (36) 1042
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
san 403140
49.5%
diego 400856
49.2%
ysidro 2072
 
0.3%
vista 1034
 
0.1%
chula 1033
 
0.1%
national 783
 
0.1%
city 783
 
0.1%
la 597
 
0.1%
el 415
 
0.1%
cajon 415
 
0.1%
Other values (49) 3319
 
0.4%

Most occurring characters

ValueCountFrequency (%)
A 409647
11.1%
S 407720
11.1%
406763
11.1%
I 406683
11.1%
O 406433
11.1%
N 406432
11.1%
D 403934
11.0%
E 403924
11.0%
G 401346
10.9%
L 4731
 
0.1%
Other values (15) 18831
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3269678
88.9%
Space Separator 406763
 
11.1%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 409647
12.5%
S 407720
12.5%
I 406683
12.4%
O 406433
12.4%
N 406432
12.4%
D 403934
12.4%
E 403924
12.4%
G 401346
12.3%
L 4731
 
0.1%
Y 3255
 
0.1%
Other values (13) 15573
 
0.5%
Space Separator
ValueCountFrequency (%)
406763
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3269678
88.9%
Common 406766
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 409647
12.5%
S 407720
12.5%
I 406683
12.4%
O 406433
12.4%
N 406432
12.4%
D 403934
12.4%
E 403924
12.4%
G 401346
12.3%
L 4731
 
0.1%
Y 3255
 
0.1%
Other values (13) 15573
 
0.5%
Common
ValueCountFrequency (%)
406763
> 99.9%
- 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3676444
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 409647
11.1%
S 407720
11.1%
406763
11.1%
I 406683
11.1%
O 406433
11.1%
N 406432
11.1%
D 403934
11.0%
E 403924
11.0%
G 401346
10.9%
L 4731
 
0.1%
Other values (15) 18831
 
0.5%

beat
Real number (ℝ)

Distinct: 126
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 511.45697
Minimum: 111
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.2 MiB

Quantile statistics

Minimum: 111
5-th percentile: 121
Q1: 315
Median: 521
Q3: 628
95-th percentile: 931
Maximum: 999
Range: 888
Interquartile range (IQR): 313

Descriptive statistics

Standard deviation: 242.57677
Coefficient of variation (CV): 0.47428578
Kurtosis: -0.77389507
Mean: 511.45697
Median Absolute Deviation (MAD): 194
Skewness: -0.070033741
Sum: 2.0851282 × 10^8
Variance: 58843.489
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
521 29982
 
7.4%
122 24592
 
6.0%
611 16419
 
4.0%
614 10569
 
2.6%
524 10192
 
2.5%
712 9851
 
2.4%
512 9651
 
2.4%
813 9582
 
2.4%
999 9483
 
2.3%
121 8744
 
2.1%
Other values (116) 268619
65.9%
ValueCountFrequency (%)
111 4467
 
1.1%
112 1542
 
0.4%
113 1514
 
0.4%
114 3534
 
0.9%
115 4463
 
1.1%
116 3624
 
0.9%
121 8744
 
2.1%
122 24592
6.0%
123 4779
 
1.2%
124 4479
 
1.1%
ValueCountFrequency (%)
999 9483
2.3%
937 1047
 
0.3%
936 393
 
0.1%
935 691
 
0.2%
934 5119
1.3%
933 1784
 
0.4%
932 481
 
0.1%
931 4407
1.1%
841 715
 
0.2%
839 1133
 
0.3%

beat_name
Categorical

Distinct: 127
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
East Village 521
 
29982
Pacific Beach 122
 
24592
Midway District 611
 
16419
Ocean Beach 614
 
10569
Core-Columbia 524
 
10192
Other values (122)
315930 

Length

Max length: 25
Median length: 22
Mean length: 15.909476
Min length: 10

Characters and Unicode

Total characters: 6486039
Distinct characters: 64
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Cherokee Point 839
2nd row: La Jolla 124
3rd row: Pacific Beach 122
4th row: Pacific Beach 122
5th row: Ocean Beach 614

Common Values

ValueCountFrequency (%)
East Village 521 29982
 
7.4%
Pacific Beach 122 24592
 
6.0%
Midway District 611 16419
 
4.0%
Ocean Beach 614 10569
 
2.6%
Core-Columbia 524 10192
 
2.5%
San Ysidro 712 9851
 
2.4%
Logan Heights 512 9651
 
2.4%
North Park 813 9582
 
2.4%
Unknown 999 9411
 
2.3%
Mission Beach 121 8744
 
2.1%
Other values (117) 268691
65.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
east 48357
 
4.1%
beach 43905
 
3.7%
park 41321
 
3.5%
mesa 35028
 
3.0%
village 32954
 
2.8%
521 29982
 
2.5%
mission 26001
 
2.2%
pacific 24592
 
2.1%
122 24592
 
2.1%
heights 21711
 
1.8%
Other values (263) 851684
72.2%

Most occurring characters

ValueCountFrequency (%)
772443
 
11.9%
a 565930
 
8.7%
e 377516
 
5.8%
i 375627
 
5.8%
1 309397
 
4.8%
l 292727
 
4.5%
r 273757
 
4.2%
2 272499
 
4.2%
s 268438
 
4.1%
t 259143
 
4.0%
Other values (54) 2718562
41.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3682859
56.8%
Decimal Number 1223052
 
18.9%
Uppercase Letter 789444
 
12.2%
Space Separator 772443
 
11.9%
Dash Punctuation 10192
 
0.2%
Other Punctuation 8049
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 565930
15.4%
e 377516
10.3%
i 375627
10.2%
l 292727
7.9%
r 273757
7.4%
s 268438
7.3%
t 259143
7.0%
o 251522
6.8%
n 233816
 
6.3%
c 157915
 
4.3%
Other values (16) 626468
17.0%
Uppercase Letter
ValueCountFrequency (%)
M 109569
13.9%
P 80851
10.2%
B 75772
9.6%
C 72612
9.2%
V 69291
8.8%
E 55874
 
7.1%
H 46169
 
5.8%
S 41954
 
5.3%
L 40574
 
5.1%
O 27049
 
3.4%
Other values (14) 169729
21.5%
Decimal Number
ValueCountFrequency (%)
1 309397
25.3%
2 272499
22.3%
3 153382
12.5%
4 127141
10.4%
5 123115
 
10.1%
6 83727
 
6.8%
8 58189
 
4.8%
7 49152
 
4.0%
9 46450
 
3.8%
Other Punctuation
ValueCountFrequency (%)
/ 5351
66.5%
. 1816
 
22.6%
' 882
 
11.0%
Space Separator
ValueCountFrequency (%)
772443
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4472303
69.0%
Common 2013736
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 565930
 
12.7%
e 377516
 
8.4%
i 375627
 
8.4%
l 292727
 
6.5%
r 273757
 
6.1%
s 268438
 
6.0%
t 259143
 
5.8%
o 251522
 
5.6%
n 233816
 
5.2%
c 157915
 
3.5%
Other values (40) 1415912
31.7%
Common
ValueCountFrequency (%)
772443
38.4%
1 309397
15.4%
2 272499
 
13.5%
3 153382
 
7.6%
4 127141
 
6.3%
5 123115
 
6.1%
6 83727
 
4.2%
8 58189
 
2.9%
7 49152
 
2.4%
9 46450
 
2.3%
Other values (4) 18241
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6486039
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
772443
 
11.9%
a 565930
 
8.7%
e 377516
 
5.8%
i 375627
 
5.8%
1 309397
 
4.8%
l 292727
 
4.5%
r 273757
 
4.2%
2 272499
 
4.2%
s 268438
 
4.1%
t 259143
 
4.0%
Other values (54) 2718562
41.9%

is_student
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
0
407522 
1
 
162

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 407522
> 99.9%
1 162
 
< 0.1%

lim_eng
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
0
399890 
1
 
7794

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

Most occurring characters

ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 399890
98.1%
1 7794
 
1.9%

age
Real number (ℝ)

Distinct: 102
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.315492
Minimum: 1
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 20
Q1: 26
Median: 35
Q3: 46
95-th percentile: 60
Maximum: 120
Range: 119
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.43056
Coefficient of variation (CV): 0.35991916
Kurtosis: -0.22271582
Mean: 37.315492
Median Absolute Deviation (MAD): 10
Skewness: 0.5885595
Sum: 15212929
Variance: 180.37995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
30 58428
14.3%
40 42673
 
10.5%
25 40305
 
9.9%
50 35877
 
8.8%
35 33339
 
8.2%
45 24881
 
6.1%
60 20837
 
5.1%
20 20273
 
5.0%
55 16130
 
4.0%
21 6938
 
1.7%
Other values (92) 108003
26.5%
ValueCountFrequency (%)
1 12
 
< 0.1%
2 5
 
< 0.1%
3 3
 
< 0.1%
4 13
 
< 0.1%
5 38
 
< 0.1%
6 13
 
< 0.1%
7 37
 
< 0.1%
8 71
 
< 0.1%
9 43
 
< 0.1%
10 320
0.1%
ValueCountFrequency (%)
120 3
 
< 0.1%
116 1
 
< 0.1%
100 15
< 0.1%
99 18
< 0.1%
98 3
 
< 0.1%
97 4
 
< 0.1%
96 1
 
< 0.1%
95 20
< 0.1%
94 6
 
< 0.1%
93 7
 
< 0.1%

gender_words
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 90
Missing (%): < 0.1%
Memory size: 6.2 MiB
Male
297732 
Female
108822 
Transgender man/boy
 
548
Transgender woman/girl
 
492

Length

Max length: 22
Median length: 4
Mean length: 4.5758672
Min length: 4

Characters and Unicode

Total characters: 1865096
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Female
4th row: Male
5th row: Male

Common Values

ValueCountFrequency (%)
Male 297732
73.0%
Female 108822
 
26.7%
Transgender man/boy 548
 
0.1%
Transgender woman/girl 492
 
0.1%
(Missing) 90
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
male 297732
72.9%
female 108822
 
26.6%
transgender 1040
 
0.3%
man/boy 548
 
0.1%
woman/girl 492
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 517456
27.7%
a 408634
21.9%
l 407046
21.8%
M 297732
16.0%
m 109862
 
5.9%
F 108822
 
5.8%
n 3120
 
0.2%
r 2572
 
0.1%
g 1532
 
0.1%
1040
 
0.1%
Other values (9) 7280
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1455422
78.0%
Uppercase Letter 407594
 
21.9%
Space Separator 1040
 
0.1%
Other Punctuation 1040
 
0.1%
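The category totals above can be reproduced with Python's standard `unicodedata` module, which exposes the general category of each code point. This is a minimal sketch that weights each distinct gender_words value by its reported count; the mapping from two-letter category codes to display names covers only the codes that occur in this column.

```python
import unicodedata
from collections import Counter

# Non-missing values and counts from the gender_words Common Values table.
value_counts = {
    "Male": 297732,
    "Female": 108822,
    "Transgender man/boy": 548,
    "Transgender woman/girl": 492,
}

# Display names for the Unicode general-category codes seen in this column.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

category_totals = Counter()
for value, count in value_counts.items():
    for ch in value:
        code = unicodedata.category(ch)  # e.g. "Lu" for "M", "Po" for "/"
        category_totals[CATEGORY_NAMES.get(code, code)] += count

total_characters = sum(category_totals.values())
for name, n in category_totals.most_common():
    print(f"{name}: {n} ({100 * n / total_characters:.1f}%)")
print("Total characters:", total_characters)
```

The same loop applied to a column containing `|` separators would report them as Math Symbol (`Sm`), which is how that category appears in other variables of this report.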

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 517456
35.6%
a 408634
28.1%
l 407046
28.0%
m 109862
 
7.5%
n 3120
 
0.2%
r 2572
 
0.2%
g 1532
 
0.1%
o 1040
 
0.1%
s 1040
 
0.1%
d 1040
 
0.1%
Other values (4) 2080
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
M 297732
73.0%
F 108822
 
26.7%
T 1040
 
0.3%
Space Separator
ValueCountFrequency (%)
1040
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1040
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1863016
99.9%
Common 2080
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 517456
27.8%
a 408634
21.9%
l 407046
21.8%
M 297732
16.0%
m 109862
 
5.9%
F 108822
 
5.8%
n 3120
 
0.2%
r 2572
 
0.1%
g 1532
 
0.1%
o 1040
 
0.1%
Other values (7) 5200
 
0.3%
Common
ValueCountFrequency (%)
1040
50.0%
/ 1040
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1865096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 517456
27.7%
a 408634
21.9%
l 407046
21.8%
M 297732
16.0%
m 109862
 
5.9%
F 108822
 
5.8%
n 3120
 
0.2%
r 2572
 
0.1%
g 1532
 
0.1%
1040
 
0.1%
Other values (9) 7280
 
0.4%

is_gendnc
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
0
407507 
1
 
177

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value: Count (Frequency %)
0: 407507 (> 99.9%)
1: 177 (< 0.1%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 407507
> 99.9%
1 177
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 407507
> 99.9%
1 177
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 407507
> 99.9%
1 177
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 407507
> 99.9%
1 177
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 407507
> 99.9%
1 177
 
< 0.1%

gender_code
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
1
297732 
2
108822 
3
 
548
4
 
492
0
 
90

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 407684
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value: Count (Frequency %)
1: 297732 (73.0%)
2: 108822 (26.7%)
3: 548 (0.1%)
4: 492 (0.1%)
0: 90 (< 0.1%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 297732
73.0%
2 108822
 
26.7%
3 548
 
0.1%
4 492
 
0.1%
0 90
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 297732
73.0%
2 108822
 
26.7%
3 548
 
0.1%
4 492
 
0.1%
0 90
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 407684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 297732
73.0%
2 108822
 
26.7%
3 548
 
0.1%
4 492
 
0.1%
0 90
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 407684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 297732
73.0%
2 108822
 
26.7%
3 548
 
0.1%
4 492
 
0.1%
0 90
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 407684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 297732
73.0%
2 108822
 
26.7%
3 548
 
0.1%
4 492
 
0.1%
0 90
 
< 0.1%

gendnc_code
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.6%
Missing: 407507
Missing (%): > 99.9%
Memory size: 6.2 MiB
5.0
177 

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 531
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5.0
2nd row: 5.0
3rd row: 5.0
4th row: 5.0
5th row: 5.0

Common Values

Value: Count (Frequency %)
5.0: 177 (< 0.1%)
(Missing): 407507 (> 99.9%)
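The two percentages reported for gendnc_code appear to use different denominators: Missing (%) is taken over all rows, while Distinct (%) is taken over the non-missing rows only. That reading is inferred from the numbers rather than stated by the report. A short sketch of the arithmetic, with illustrative variable names:

```python
# Figures copied from the gendnc_code overview; total_rows is the dataset's
# number of observations.
total_rows = 407684
missing = 407507
distinct = 1

# Rows that actually carry a value.
present = total_rows - missing

# Missing (%) is relative to all rows.
missing_pct = 100 * missing / total_rows

# Distinct (%) is assumed to be relative to non-missing rows only.
distinct_pct = 100 * distinct / present

print(f"present rows: {present}")
print(f"Missing (%): {missing_pct:.2f}")
print(f"Distinct (%): {distinct_pct:.1f}")
```

The 177 present rows equal the number of rows where is_gendnc is 1, which is consistent with gendnc_code being filled in only for those records.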

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
5.0 177
100.0%

Most occurring characters

ValueCountFrequency (%)
5 177
33.3%
. 177
33.3%
0 177
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 354
66.7%
Other Punctuation 177
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 177
50.0%
0 177
50.0%
Other Punctuation
ValueCountFrequency (%)
. 177
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 531
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 177
33.3%
. 177
33.3%
0 177
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 531
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 177
33.3%
. 177
33.3%
0 177
33.3%

lgbt
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB
False
396960 
True
 
10724
Value: Count (Frequency %)
False: 396960 (97.4%)
True: 10724 (2.6%)

race
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
white
170777 
hisp
119669 
black
82052 
asian
31260 
nhopi
 
3112

Length

Max length: 5
Median length: 5
Mean length: 4.7044696
Min length: 4

Characters and Unicode

Total characters: 1917937
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: hisp
2nd row: white
3rd row: white
4th row: hisp
5th row: black

Common Values

Value: Count (Frequency %)
white: 170777 (41.9%)
hisp: 119669 (29.4%)
black: 82052 (20.1%)
asian: 31260 (7.7%)
nhopi: 3112 (0.8%)
aian: 814 (0.2%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
white 170777
41.9%
hisp 119669
29.4%
black 82052
20.1%
asian 31260
 
7.7%
nhopi 3112
 
0.8%
aian 814
 
0.2%

Most occurring characters

ValueCountFrequency (%)
i 325632
17.0%
h 293558
15.3%
w 170777
8.9%
t 170777
8.9%
e 170777
8.9%
s 150929
7.9%
a 146200
7.6%
p 122781
 
6.4%
b 82052
 
4.3%
l 82052
 
4.3%
Other values (4) 202402
10.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1917937
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 325632
17.0%
h 293558
15.3%
w 170777
8.9%
t 170777
8.9%
e 170777
8.9%
s 150929
7.9%
a 146200
7.6%
p 122781
 
6.4%
b 82052
 
4.3%
l 82052
 
4.3%
Other values (4) 202402
10.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 1917937
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 325632
17.0%
h 293558
15.3%
w 170777
8.9%
t 170777
8.9%
e 170777
8.9%
s 150929
7.9%
a 146200
7.6%
p 122781
 
6.4%
b 82052
 
4.3%
l 82052
 
4.3%
Other values (4) 202402
10.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1917937
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 325632
17.0%
h 293558
15.3%
w 170777
8.9%
t 170777
8.9%
e 170777
8.9%
s 150929
7.9%
a 146200
7.6%
p 122781
 
6.4%
b 82052
 
4.3%
l 82052
 
4.3%
Other values (4) 202402
10.6%

disability
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 134
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 MiB
None
388791 
Mental health condition
 
13848
Other disability
 
1776
Intellectual or developmental disability, including dementia
 
626
Speech impairment or limited use of language
 
558
Other values (129)
 
2085

Length

Max length: 201
Median length: 4
Mean length: 5.1228329
Min length: 4

Characters and Unicode

Total characters: 2088497
Distinct characters: 30
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): < 0.1%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: None
5th row: None

Common Values

Value: Count (Frequency %)
None: 388791 (95.4%)
Mental health condition: 13848 (3.4%)
Other disability: 1776 (0.4%)
Intellectual or developmental disability, including dementia: 626 (0.2%)
Speech impairment or limited use of language: 558 (0.1%)
Deafness or difficulty hearing: 461 (0.1%)
Intellectual or developmental disability, including dementia|Mental health condition: 285 (0.1%)
Blind or limited vision: 268 (0.1%)
Mental health condition|Intellectual or developmental disability, including dementia: 241 (0.1%)
Mental health condition|Other disability: 124 (< 0.1%)
Other values (124): 706 (0.2%)
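Several disability values are pipe-delimited combinations, and the same pair can appear in either order, which inflates the distinct count. The sketch below splits the itemised rows on `|`, tallies each individual flag, and merges combinations regardless of order. Only the ten itemised values are used, since the report does not break down the "Other values (124)" bucket, so the totals are partial.

```python
from collections import Counter

# Itemised rows from the disability Common Values table (partial data).
listed = {
    "None": 388791,
    "Mental health condition": 13848,
    "Other disability": 1776,
    "Intellectual or developmental disability, including dementia": 626,
    "Speech impairment or limited use of language": 558,
    "Deafness or difficulty hearing": 461,
    "Intellectual or developmental disability, including dementia|Mental health condition": 285,
    "Blind or limited vision": 268,
    "Mental health condition|Intellectual or developmental disability, including dementia": 241,
    "Mental health condition|Other disability": 124,
}

flag_counts = Counter()   # rows mentioning each individual flag
combo_counts = Counter()  # rows per order-independent combination

for value, count in listed.items():
    flags = value.split("|")
    for flag in set(flags):
        flag_counts[flag] += count
    combo_counts[tuple(sorted(flags))] += count

print(flag_counts.most_common(3))
print(len(listed), "listed values collapse to", len(combo_counts), "combinations")
```

Sorting the flags before counting is one way to normalise order; whether the original order carries meaning in the source data is not something the report indicates.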

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 388791
85.4%
health 14978
 
3.3%
condition 14399
 
3.2%
mental 14372
 
3.2%
disability 3359
 
0.7%
or 3355
 
0.7%
other 1954
 
0.4%
developmental 1387
 
0.3%
including 1387
 
0.3%
limited 1275
 
0.3%
Other values (50) 10082
 
2.2%

Most occurring characters

ValueCountFrequency (%)
n 444578
21.3%
e 438417
21.0%
o 424782
20.3%
N 388791
18.6%
t 59149
 
2.8%
i 52496
 
2.5%
47655
 
2.3%
l 45125
 
2.2%
a 41741
 
2.0%
h 33735
 
1.6%
Other values (20) 112028
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1628493
78.0%
Uppercase Letter 409323
 
19.6%
Space Separator 47655
 
2.3%
Math Symbol 1639
 
0.1%
Other Punctuation 1387
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 444578
27.3%
e 438417
26.9%
o 424782
26.1%
t 59149
 
3.6%
i 52496
 
3.2%
l 45125
 
2.8%
a 41741
 
2.6%
h 33735
 
2.1%
d 25090
 
1.5%
c 19323
 
1.2%
Other values (10) 44057
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
N 388791
95.0%
M 14978
 
3.7%
O 2199
 
0.5%
I 1387
 
0.3%
S 878
 
0.2%
D 693
 
0.2%
B 397
 
0.1%
Space Separator
ValueCountFrequency (%)
47655
100.0%
Math Symbol
ValueCountFrequency (%)
| 1639
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1387
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2037816
97.6%
Common 50681
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 444578
21.8%
e 438417
21.5%
o 424782
20.8%
N 388791
19.1%
t 59149
 
2.9%
i 52496
 
2.6%
l 45125
 
2.2%
a 41741
 
2.0%
h 33735
 
1.7%
d 25090
 
1.2%
Other values (17) 83912
 
4.1%
Common
ValueCountFrequency (%)
47655
94.0%
| 1639
 
3.2%
, 1387
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2088497
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 444578
21.3%
e 438417
21.0%
o 424782
20.3%
N 388791
18.6%
t 59149
 
2.8%
i 52496
 
2.5%
47655
 
2.3%
l 45125
 
2.2%
a 41741
 
2.0%
h 33735
 
1.6%
Other values (20) 112028
 
5.4%

reason_words
Categorical

Distinct: 8
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 6.2 MiB
Reasonable Suspicion
213781 
Traffic Violation
175061 
Investigation to determine whether the person was truant
 
5358
Known to be on Parole / Probation / PRCS / Mandatory Supervision
 
5126
Consensual Encounter resulting in a search
 
4418
Other values (3)
 
3935

Length

Max length: 113
Median length: 20
Mean length: 20.295608
Min length: 17

Characters and Unicode

Total characters: 8274093
Distinct characters: 43
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Traffic Violation
2nd row: Reasonable Suspicion
3rd row: Reasonable Suspicion
4th row: Reasonable Suspicion
5th row: Reasonable Suspicion

Common Values

Value: Count (Frequency %)
Reasonable Suspicion: 213781 (52.4%)
Traffic Violation: 175061 (42.9%)
Investigation to determine whether the person was truant: 5358 (1.3%)
Known to be on Parole / Probation / PRCS / Mandatory Supervision: 5126 (1.3%)
Consensual Encounter resulting in a search: 4418 (1.1%)
Knowledge of outstanding arrest warrant/wanted person: 3904 (1.0%)
Determine whether the student violated school policy: 27 (< 0.1%)
Possible conduct warranting discipline under Education Code sections 48900, 48900.2, 48900.3, 48900.4 and 48900.7: 4 (< 0.1%)
(Missing): 5 (< 0.1%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
reasonable 213781
22.9%
suspicion 213781
22.9%
traffic 175061
18.8%
violation 175061
18.8%
15378
 
1.6%
to 10484
 
1.1%
person 9262
 
1.0%
determine 5385
 
0.6%
whether 5385
 
0.6%
the 5385
 
0.6%
Other values (40) 103274
11.1%

Most occurring characters

ValueCountFrequency (%)
i 997046
12.1%
o 859346
 
10.4%
a 847079
 
10.2%
n 710187
 
8.6%
524558
 
6.3%
e 523232
 
6.3%
s 478220
 
5.8%
l 406797
 
4.9%
c 397752
 
4.8%
f 354026
 
4.3%
Other values (33) 2175850
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6888154
83.2%
Uppercase Letter 841955
 
10.2%
Space Separator 524558
 
6.3%
Other Punctuation 19310
 
0.2%
Decimal Number 116
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 997046
14.5%
o 859346
12.5%
a 847079
12.3%
n 710187
10.3%
e 523232
7.6%
s 478220
6.9%
l 406797
 
5.9%
c 397752
 
5.8%
f 354026
 
5.1%
t 261837
 
3.8%
Other values (11) 1052632
15.3%
Uppercase Letter
ValueCountFrequency (%)
S 224033
26.6%
R 218907
26.0%
V 175061
20.8%
T 175061
20.8%
P 15382
 
1.8%
C 9548
 
1.1%
K 9030
 
1.1%
I 5358
 
0.6%
M 5126
 
0.6%
E 4422
 
0.5%
Decimal Number
ValueCountFrequency (%)
0 40
34.5%
4 24
20.7%
8 20
17.2%
9 20
17.2%
2 4
 
3.4%
3 4
 
3.4%
7 4
 
3.4%
Other Punctuation
ValueCountFrequency (%)
/ 19282
99.9%
. 16
 
0.1%
, 12
 
0.1%
Space Separator
ValueCountFrequency (%)
524558
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7730109
93.4%
Common 543984
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 997046
12.9%
o 859346
11.1%
a 847079
11.0%
n 710187
 
9.2%
e 523232
 
6.8%
s 478220
 
6.2%
l 406797
 
5.3%
c 397752
 
5.1%
f 354026
 
4.6%
t 261837
 
3.4%
Other values (22) 1894587
24.5%
Common
ValueCountFrequency (%)
524558
96.4%
/ 19282
 
3.5%
0 40
 
< 0.1%
4 24
 
< 0.1%
8 20
 
< 0.1%
9 20
 
< 0.1%
. 16
 
< 0.1%
, 12
 
< 0.1%
2 4
 
< 0.1%
3 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8274093
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 997046
12.1%
o 859346
 
10.4%
a 847079
 
10.2%
n 710187
 
8.6%
524558
 
6.3%
e 523232
 
6.3%
s 478220
 
5.8%
l 406797
 
4.9%
c 397752
 
4.8%
f 354026
 
4.3%
Other values (33) 2175850
26.3%

reasonid
Real number (ℝ)

Distinct: 1693
Distinct (%): 0.4%
Missing: 18844
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 51661.962
Minimum: 3
Maximum: 99999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.2 MiB

Quantile statistics

Minimum: 3
5-th percentile: 22004
Q1: 41063
median: 54153
Q3: 54655
95-th percentile: 99990
Maximum: 99999
Range: 99996
Interquartile range (IQR): 13592

Descriptive statistics

Standard deviation: 17956.083
Coefficient of variation (CV): 0.34756875
Kurtosis: 1.5025841
Mean: 51661.962
Median Absolute Deviation (MAD): 2017
Skewness: 0.460685
Sum: 2.0088237 × 10^10
Variance: 3.2242093 × 10^8
Monotonicity: Not monotonic
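The quantile and descriptive statistics for reasonid can be cross-checked against each other. The sketch below recomputes the interquartile range and the range, and divides the sum by the mean to recover how many rows contributed, reading the Sum as 2.0088237 × 10^10. It assumes the statistics are computed over non-missing rows only, which the result supports; variable names are illustrative.

```python
# Figures copied from the reasonid overview and statistics blocks.
q1, q3 = 41063, 54655
minimum, maximum = 3, 99999
mean = 51661.962
total_sum = 2.0088237e10
total_rows, missing = 407684, 18844

# Interquartile range and range follow directly from the quantiles.
iqr = q3 - q1
value_range = maximum - minimum

# Sum divided by mean gives the number of values that were averaged.
implied_count = round(total_sum / mean)

print(f"IQR: {iqr}")
print(f"Range: {value_range}")
print(f"implied non-missing rows: {implied_count}")
print(f"total rows minus missing: {total_rows - missing}")
```

Because reasonid values look like identifier codes rather than measurements, these checks confirm internal consistency of the report and say little about the data itself.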
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
65002 28446
 
7.0%
32022 18215
 
4.5%
99990 17159
 
4.2%
32111 16823
 
4.1%
54167 16209
 
4.0%
65000 14655
 
3.6%
54106 14147
 
3.5%
54655 11917
 
2.9%
41063 9946
 
2.4%
54146 9512
 
2.3%
Other values (1683) 231811
56.9%
(Missing) 18844
 
4.6%
ValueCountFrequency (%)
3 27
< 0.1%
3065 1
 
< 0.1%
3068 1
 
< 0.1%
4021 6
 
< 0.1%
4022 67
< 0.1%
4023 1
 
< 0.1%
4026 1
 
< 0.1%
4028 1
 
< 0.1%
4031 5
 
< 0.1%
4032 4
 
< 0.1%
ValueCountFrequency (%)
99999 5911
 
1.4%
99990 17159
4.2%
89105 4
 
< 0.1%
89005 9
 
< 0.1%
66218 3
 
< 0.1%
66211 82
 
< 0.1%
66210 204
 
0.1%
66208 1646
 
0.4%
66207 169
 
< 0.1%
66206 6
 
< 0.1%

reason_text
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 1697
Distinct (%): 0.4%
Missing: 18844
Missing (%): 4.6%
Memory size: 6.2 MiB
65002 ZZ - LOCAL ORDINANCE VIOL (I) 65002
 
28446
602 PC - TRESPASSING (M) 32022
 
18215
647(E) PC - DIS CON:LODGE W/O CONSENT (M) 32111
 
16823
22450(A) VC - FAIL STOP VEH:XWALK/ETC (I) 54167
 
16209
65000 ZZ - LOCAL ORDINANCE VIOL (M) 65000
 
14655
Other values (1692)
294492 

Length

Max length: 56
Median length: 53
Mean length: 44.110012
Min length: 24

Characters and Unicode

Total characters: 17151737
Distinct characters: 49
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 445
Unique (%): 0.1%

Sample

1st row: 27150(A) VC - INADEQUATE MUFFLERS (I) 54116
2nd row: 415(2) PC - LOUD/UNREASONABLE NOISE (I) 53130
3rd row: 647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005
4th row: 647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005
5th row: 602 PC - TRESPASSING (M) 32022

Common Values

Value: Count (Frequency %)
65002 ZZ - LOCAL ORDINANCE VIOL (I) 65002: 28446 (7.0%)
602 PC - TRESPASSING (M) 32022: 18215 (4.5%)
647(E) PC - DIS CON:LODGE W/O CONSENT (M) 32111: 16823 (4.1%)
22450(A) VC - FAIL STOP VEH:XWALK/ETC (I) 54167: 16209 (4.0%)
65000 ZZ - LOCAL ORDINANCE VIOL (M) 65000: 14655 (3.6%)
NA - XX ZZ - COMMUNITY CARETAKING (X) 99990: 14557 (3.6%)
22350 VC - UNSAFE SPEED:PREVAIL COND (I) 54106: 14147 (3.5%)
25620 BP - POSS OPEN ALCOHOL:PUBLIC (I) 41063: 9946 (2.4%)
23123.5 VC - NO HND HLD DEVICE W/DRIVE (I) 54655: 9532 (2.3%)
21461(A) VC - DRIVER FAIL OBEY SIGN/ETC (I) 54146: 9512 (2.3%)
Other values (1687): 236798 (58.1%)
(Missing): 18844 (4.6%)
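The reason_text values follow a visibly regular layout: a section reference, a two-letter code, a dash, a description, a one-letter flag in parentheses, and a trailing number that matches the values seen in reasonid. The sketch below parses that layout with a regular expression. The layout is inferred from the sample rows only, the field names are hypothetical, and the meaning of the code and flag letters is not given in the report, so the sketch does not interpret them.

```python
import re

# Assumed layout, inferred from the listed values:
#   <section> <2-letter code> - <description> (<1-letter flag>) <numeric id>
PATTERN = re.compile(
    r"^(?P<section>.+?)\s+(?P<code>[A-Z]{2})\s+-\s+(?P<description>.+)\s+"
    r"\((?P<flag>[A-Z])\)\s+(?P<reason_id>\d+)$"
)


def parse_reason_text(text):
    """Return the labelled parts of a reason_text value, or None if it does not fit."""
    match = PATTERN.match(text.strip())
    return match.groupdict() if match else None


samples = [
    "602 PC - TRESPASSING (M) 32022",
    "647(E) PC - DIS CON:LODGE W/O CONSENT (M) 32111",
    "NA - XX ZZ - COMMUNITY CARETAKING (X) 99990",
]
for sample in samples:
    print(parse_reason_text(sample))
```

Values that do not fit the pattern return `None`, so they can be collected and inspected instead of being silently mis-parsed.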

Length

Histogram of lengths of the category
ValueCountFrequency (%)
412682
 
12.9%
i 218922
 
6.8%
vc 183432
 
5.7%
m 124213
 
3.9%
pc 111743
 
3.5%
zz 62689
 
2.0%
65002 56892
 
1.8%
fail 51302
 
1.6%
viol 51009
 
1.6%
local 43101
 
1.3%
Other values (5845) 1894179
59.0%

Most occurring characters

ValueCountFrequency (%)
2821324
 
16.4%
I 819904
 
4.8%
E 777645
 
4.5%
C 738813
 
4.3%
A 689810
 
4.0%
( 631649
 
3.7%
) 631602
 
3.7%
O 612034
 
3.6%
0 601951
 
3.5%
2 564425
 
3.3%
Other values (39) 8262580
48.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8716397
50.8%
Decimal Number 3603590
21.0%
Space Separator 2821324
 
16.4%
Open Punctuation 631649
 
3.7%
Close Punctuation 631602
 
3.7%
Dash Punctuation 413250
 
2.4%
Other Punctuation 332906
 
1.9%
Currency Symbol 829
 
< 0.1%
Math Symbol 190
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 819904
 
9.4%
E 777645
 
8.9%
C 738813
 
8.5%
A 689810
 
7.9%
O 612034
 
7.0%
N 548839
 
6.3%
L 545179
 
6.3%
T 438047
 
5.0%
S 427035
 
4.9%
P 415949
 
4.8%
Other values (16) 2703142
31.0%
Decimal Number
ValueCountFrequency (%)
0 601951
16.7%
2 564425
15.7%
5 561441
15.6%
4 463016
12.8%
1 439987
12.2%
6 334787
9.3%
3 279634
7.8%
9 172665
 
4.8%
7 118941
 
3.3%
8 66743
 
1.9%
Other Punctuation
ValueCountFrequency (%)
/ 169870
51.0%
: 132445
39.8%
. 27167
 
8.2%
& 3285
 
1.0%
' 95
 
< 0.1%
" 42
 
< 0.1%
, 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2821324
100.0%
Open Punctuation
ValueCountFrequency (%)
( 631649
100.0%
Close Punctuation
ValueCountFrequency (%)
) 631602
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 413250
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 829
100.0%
Math Symbol
ValueCountFrequency (%)
+ 190
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8716397
50.8%
Common 8435340
49.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 819904
 
9.4%
E 777645
 
8.9%
C 738813
 
8.5%
A 689810
 
7.9%
O 612034
 
7.0%
N 548839
 
6.3%
L 545179
 
6.3%
T 438047
 
5.0%
S 427035
 
4.9%
P 415949
 
4.8%
Other values (16) 2703142
31.0%
Common
ValueCountFrequency (%)
2821324
33.4%
( 631649
 
7.5%
) 631602
 
7.5%
0 601951
 
7.1%
2 564425
 
6.7%
5 561441
 
6.7%
4 463016
 
5.5%
1 439987
 
5.2%
- 413250
 
4.9%
6 334787
 
4.0%
Other values (13) 971908
 
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17151737
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2821324
 
16.4%
I 819904
 
4.8%
E 777645
 
4.5%
C 738813
 
4.3%
A 689810
 
4.0%
( 631649
 
3.7%
) 631602
 
3.7%
O 612034
 
3.6%
0 601951
 
3.5%
2 564425
 
3.3%
Other values (39) 8262580
48.2%

reason_detail
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct: 282
Distinct (%): 0.1%
Missing: 18838
Missing (%): 4.6%
Memory size: 6.2 MiB
Moving Violation
107984 
Officer witnessed commission of a crime
85183 
Matched suspect description
68932 
Equipment Violation
51210 
Other Reasonable Suspicion of a crime
36636 
Other values (277)
38901 

Length

Max length: 232
Median length: 210
Mean length: 29.898466
Min length: 16

Characters and Unicode

Total characters: 11625899
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 131
Unique (%): < 0.1%

Sample

1st row: Equipment Violation
2nd row: Officer witnessed commission of a crime
3rd row: Officer witnessed commission of a crime
4th row: Officer witnessed commission of a crime
5th row: Matched suspect description

Common Values

Value: Count (Frequency %)
Moving Violation: 107984 (26.5%)
Officer witnessed commission of a crime: 85183 (20.9%)
Matched suspect description: 68932 (16.9%)
Equipment Violation: 51210 (12.6%)
Other Reasonable Suspicion of a crime: 36636 (9.0%)
Non-moving Violation, including Registration Violation: 15867 (3.9%)
Witness or Victim identification of Suspect at the scene: 8664 (2.1%)
Matched suspect description|Witness or Victim identification of Suspect at the scene: 2714 (0.7%)
Matched suspect description|Officer witnessed commission of a crime: 2175 (0.5%)
Witness or Victim identification of Suspect at the scene|Matched suspect description: 1522 (0.4%)
Other values (272): 7959 (2.0%)
(Missing): 18838 (4.6%)

Length

Histogram of lengths of the category
ValueCountFrequency (%)
violation 190928
12.3%
of 146809
 
9.5%
a 130440
 
8.4%
crime 126477
 
8.2%
moving 107984
 
7.0%
suspect 93364
 
6.0%
witnessed 89822
 
5.8%
commission 89822
 
5.8%
officer 86627
 
5.6%
matched 75197
 
4.9%
Other values (85) 412092
26.6%

Most occurring characters

ValueCountFrequency (%)
i 1468469
12.6%
1160716
 
10.0%
o 1060373
 
9.1%
e 913193
 
7.9%
n 838158
 
7.2%
s 755743
 
6.5%
t 753249
 
6.5%
c 671800
 
5.8%
a 532953
 
4.6%
m 392075
 
3.4%
Other values (36) 3079170
26.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9702646
83.5%
Space Separator 1160716
 
10.0%
Uppercase Letter 717984
 
6.2%
Other Punctuation 15873
 
0.1%
Dash Punctuation 15871
 
0.1%
Math Symbol 12785
 
0.1%
Decimal Number 24
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1468469
15.1%
o 1060373
10.9%
e 913193
9.4%
n 838158
8.6%
s 755743
 
7.8%
t 753249
 
7.8%
c 671800
 
6.9%
a 532953
 
5.5%
m 392075
 
4.0%
r 373189
 
3.8%
Other values (15) 1943444
20.0%
Uppercase Letter
ValueCountFrequency (%)
V 205240
28.6%
M 187036
26.1%
O 129951
18.1%
R 55291
 
7.7%
S 54783
 
7.6%
E 51210
 
7.1%
N 15867
 
2.2%
W 14312
 
2.0%
A 3251
 
0.5%
C 705
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 8
33.3%
4 6
25.0%
8 4
16.7%
9 4
16.7%
7 2
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 15869
> 99.9%
. 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1160716
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15871
100.0%
Math Symbol
ValueCountFrequency (%)
| 12785
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10420630
89.6%
Common 1205269
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1468469
14.1%
o 1060373
 
10.2%
e 913193
 
8.8%
n 838158
 
8.0%
s 755743
 
7.3%
t 753249
 
7.2%
c 671800
 
6.4%
a 532953
 
5.1%
m 392075
 
3.8%
r 373189
 
3.6%
Other values (26) 2661428
25.5%
Common
ValueCountFrequency (%)
1160716
96.3%
- 15871
 
1.3%
, 15869
 
1.3%
| 12785
 
1.1%
0 8
 
< 0.1%
4 6
 
< 0.1%
8 4
 
< 0.1%
9 4
 
< 0.1%
. 4
 
< 0.1%
7 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11625899
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1468469
12.6%
1160716
 
10.0%
o 1060373
 
9.1%
e 913193
 
7.9%
n 838158
 
7.2%
s 755743
 
6.5%
t 753249
 
6.5%
c 671800
 
5.8%
a 532953
 
4.6%
m 392075
 
3.4%
Other values (36) 3079170
26.5%

reason_exp
Categorical

Distinct: 183583
Distinct (%): 45.0%
Missing: 82
Missing (%): < 0.1%
Memory size: 6.2 MiB
cell phone
 
4819
stop sign
 
4721
speeding
 
4497
SPEED
 
4155
encroachment
 
3658
Other values (183578)
385752 

Length

Max length: 250
Median length: 235
Mean length: 28.53013
Min length: 2

Characters and Unicode

Total characters: 11628938
Distinct characters: 92
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 155880
Unique (%): 38.2%

Sample

1st row: LOUD EXHAUST
2nd row: loud party
3rd row: stumbling back and forth, unable to maintain balance
4th row: fighting with security
5th row: rc of male at vacant house

Common Values

Value: Count (Frequency %)
cell phone: 4819 (1.2%)
stop sign: 4721 (1.2%)
speeding: 4497 (1.1%)
SPEED: 4155 (1.0%)
encroachment: 3658 (0.9%)
radio call: 3482 (0.9%)
speed: 3479 (0.9%)
STOP SIGN: 2446 (0.6%)
CELL PHONE: 2071 (0.5%)
ped stop: 2056 (0.5%)
Other values (183573): 372218 (91.3%)
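The most common reason_exp values include the same phrase in different letter cases ("speed" and "SPEED", "stop sign" and "STOP SIGN"), so part of the column's high cardinality comes from free-text formatting. The sketch below case-folds and whitespace-normalises the ten itemised values and merges their counts. It covers only the rows listed above, not the 183573 other values, and the normalisation rule is one illustrative choice among many.

```python
from collections import Counter

# Itemised rows from the reason_exp Common Values table (partial data).
listed = {
    "cell phone": 4819,
    "stop sign": 4721,
    "speeding": 4497,
    "SPEED": 4155,
    "encroachment": 3658,
    "radio call": 3482,
    "speed": 3479,
    "STOP SIGN": 2446,
    "CELL PHONE": 2071,
    "ped stop": 2056,
}

merged = Counter()
for text, count in listed.items():
    # Case-fold and collapse runs of whitespace so variants share one key.
    key = " ".join(text.casefold().split())
    merged[key] += count

for key, count in merged.most_common():
    print(f"{key}: {count}")
print(len(listed), "listed values collapse to", len(merged))
```

This rule deliberately leaves "speed" and "speeding" separate; merging them would need stemming or a manual mapping.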

Length

Histogram of lengths of the category
ValueCountFrequency (%)
in 54094
 
2.7%
subject 51302
 
2.6%
of 50374
 
2.6%
on 43020
 
2.2%
a 40364
 
2.1%
was 37897
 
1.9%
to 35570
 
1.8%
stop 35343
 
1.8%
call 29540
 
1.5%
and 27506
 
1.4%
Other values (28030) 1562251
79.4%

Most occurring characters

ValueCountFrequency (%)
1563291
 
13.4%
e 750025
 
6.4%
i 566936
 
4.9%
t 537857
 
4.6%
a 530621
 
4.6%
n 527560
 
4.5%
o 497686
 
4.3%
s 430617
 
3.7%
r 398458
 
3.4%
l 366458
 
3.2%
Other values (82) 5459429
46.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6810897
58.6%
Uppercase Letter 2968066
25.5%
Space Separator 1563291
 
13.4%
Decimal Number 177838
 
1.5%
Other Punctuation 96107
 
0.8%
Dash Punctuation 6337
 
0.1%
Open Punctuation 3051
 
< 0.1%
Close Punctuation 3006
 
< 0.1%
Math Symbol 273
 
< 0.1%
Currency Symbol 62
 
< 0.1%
Other values (2) 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 750025
 
11.0%
i 566936
 
8.3%
t 537857
 
7.9%
a 530621
 
7.8%
n 527560
 
7.7%
o 497686
 
7.3%
s 430617
 
6.3%
r 398458
 
5.9%
l 366458
 
5.4%
d 296457
 
4.4%
Other values (16) 1908222
28.0%
Uppercase Letter
ValueCountFrequency (%)
E 317034
 
10.7%
I 242750
 
8.2%
N 222256
 
7.5%
T 221118
 
7.4%
S 214555
 
7.2%
A 214253
 
7.2%
O 211596
 
7.1%
R 174216
 
5.9%
L 159276
 
5.4%
D 140110
 
4.7%
Other values (16) 850902
28.7%
Other Punctuation
ValueCountFrequency (%)
. 63525
66.1%
, 17132
 
17.8%
/ 10546
 
11.0%
' 2515
 
2.6%
& 907
 
0.9%
" 635
 
0.7%
; 263
 
0.3%
# 170
 
0.2%
: 156
 
0.2%
@ 89
 
0.1%
Other values (5) 169
 
0.2%
Decimal Number
ValueCountFrequency (%)
5 43365
24.4%
1 38792
21.8%
0 30811
17.3%
4 19052
10.7%
2 14599
 
8.2%
6 9982
 
5.6%
3 6180
 
3.5%
7 5266
 
3.0%
8 5078
 
2.9%
9 4713
 
2.7%
Math Symbol
ValueCountFrequency (%)
+ 211
77.3%
> 42
 
15.4%
= 10
 
3.7%
< 8
 
2.9%
~ 2
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 3004
98.5%
[ 47
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 2983
99.2%
] 23
 
0.8%
Modifier Symbol
ValueCountFrequency (%)
` 4
80.0%
^ 1
 
20.0%
Space Separator
ValueCountFrequency (%)
1563291
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6337
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 62
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9778963
84.1%
Common 1849975
 
15.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 750025
 
7.7%
i 566936
 
5.8%
t 537857
 
5.5%
a 530621
 
5.4%
n 527560
 
5.4%
o 497686
 
5.1%
s 430617
 
4.4%
r 398458
 
4.1%
l 366458
 
3.7%
E 317034
 
3.2%
Other values (42) 4855711
49.7%
Common
ValueCountFrequency (%)
1563291
84.5%
. 63525
 
3.4%
5 43365
 
2.3%
1 38792
 
2.1%
0 30811
 
1.7%
4 19052
 
1.0%
, 17132
 
0.9%
2 14599
 
0.8%
/ 10546
 
0.6%
6 9982
 
0.5%
Other values (30) 38880
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11628938
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1563291
 
13.4%
e 750025
 
6.4%
i 566936
 
4.9%
t 537857
 
4.6%
a 530621
 
4.6%
n 527560
 
4.5%
o 497686
 
4.3%
s 430617
 
3.7%
r 398458
 
3.4%
l 366458
 
3.2%
Other values (82) 5459429
46.9%

search_basis
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct: 721
Distinct (%): 0.8%
Missing: 321160
Missing (%): 78.8%
Memory size: 6.2 MiB
Incident to arrest
39048 
Condition of parole / probation/ PRCS / mandatory supervision
23248 
Consent given
5589 
Officer Safety/safety of others
3961 
Vehicle inventory
 
1963
Other values (716)
12715 

Length

Max length: 182
Median length: 174
Mean length: 34.487506
Min length: 13

Characters and Unicode

Total characters: 2983997
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 396
Unique (%): 0.5%

Sample

1st row: Vehicle inventory
2nd row: Incident to arrest
3rd row: Incident to arrest
4th row: Incident to arrest
5th row: Incident to arrest

Common Values

Value: Count (Frequency %)
Incident to arrest: 39048 (9.6%)
Condition of parole / probation/ PRCS / mandatory supervision: 23248 (5.7%)
Consent given: 5589 (1.4%)
Officer Safety/safety of others: 3961 (1.0%)
Vehicle inventory: 1963 (0.5%)
Condition of parole / probation/ PRCS / mandatory supervision|Incident to arrest: 1034 (0.3%)
Visible contraband: 938 (0.2%)
Incident to arrest|Officer Safety/safety of others: 911 (0.2%)
Incident to arrest|Condition of parole / probation/ PRCS / mandatory supervision: 715 (0.2%)
Consent given|Incident to arrest: 564 (0.1%)
Other values (711): 8553 (2.1%)
(Missing): 321160 (78.8%)

Length

Histogram of lengths of the category
ValueCountFrequency (%)
53566
12.3%
to 45600
10.5%
arrest 42100
9.7%
incident 42093
9.7%
of 36452
8.4%
parole 26783
 
6.2%
probation 26783
 
6.2%
prcs 26783
 
6.2%
mandatory 26783
 
6.2%
condition 25187
 
5.8%
Other values (128) 83147
19.1%

Most occurring characters

ValueCountFrequency (%)
348753
11.7%
o 294376
 
9.9%
n 268287
 
9.0%
t 257125
 
8.6%
r 222991
 
7.5%
e 214760
 
7.2%
i 209486
 
7.0%
a 176466
 
5.9%
s 129107
 
4.3%
d 105784
 
3.5%
Other values (24) 756862
25.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2321650
77.8%
Space Separator 348753
 
11.7%
Uppercase Letter 213531
 
7.2%
Other Punctuation 88097
 
3.0%
Math Symbol 11966
 
0.4%
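
The category names in the table above (Lowercase Letter, Space Separator, Math Symbol, and so on) correspond to Unicode general categories. A minimal sketch of how such counts could be reproduced with Python's standard `unicodedata` module is given below; the helper name `category_counts`, the name mapping, and the two sample strings are illustrative assumptions, not the report's own implementation.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes seen in this column.
# (Hypothetical mapping, limited to the five categories listed above.)
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values copied from the search_basis Common Values table.
sample = ["Incident to arrest", "Consent given|Incident to arrest"]
counts = category_counts(sample)
```

This also shows why the pipe separator is reported under Math Symbol: `unicodedata.category("|")` is `Sm`, while the slash falls under Other Punctuation (`Po`).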

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 294376
12.7%
n 268287
11.6%
t 257125
11.1%
r 222991
9.6%
e 214760
9.3%
i 209486
9.0%
a 176466
7.6%
s 129107
 
5.6%
d 105784
 
4.6%
p 83631
 
3.6%
Other values (12) 359637
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 62715
29.4%
I 45600
21.4%
S 36333
17.0%
R 26783
12.5%
P 26783
12.5%
O 8261
 
3.9%
V 4990
 
2.3%
E 1652
 
0.8%
W 414
 
0.2%
Space Separator
ValueCountFrequency (%)
348753
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 88097
100.0%
Math Symbol
ValueCountFrequency (%)
| 11966
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2535181
85.0%
Common 448816
 
15.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 294376
11.6%
n 268287
10.6%
t 257125
10.1%
r 222991
 
8.8%
e 214760
 
8.5%
i 209486
 
8.3%
a 176466
 
7.0%
s 129107
 
5.1%
d 105784
 
4.2%
p 83631
 
3.3%
Other values (21) 573168
22.6%
Common
ValueCountFrequency (%)
348753
77.7%
/ 88097
 
19.6%
| 11966
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2983997
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
348753
11.7%
o 294376
 
9.9%
n 268287
 
9.0%
t 257125
 
8.6%
r 222991
 
7.5%
e 214760
 
7.2%
i 209486
 
7.0%
a 176466
 
5.9%
s 129107
 
4.3%
d 105784
 
3.5%
Other values (24) 756862
25.4%
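
Several search_basis values above combine more than one basis with a pipe (for example "Consent given|Incident to arrest"), so counting whole strings understates how often each individual basis occurs. The following pandas sketch counts per label instead; the toy Series stands in for the real column, and treating "|" purely as a separator is an assumption.

```python
import pandas as pd

# Toy stand-in for the search_basis column (values copied from the report).
search_basis = pd.Series([
    "Incident to arrest",
    "Consent given|Incident to arrest",
    "Incident to arrest|Officer Safety/safety of others",
    None,
])

# Split on the pipe, put one label per row, and count each label.
label_counts = (
    search_basis.dropna()
    .str.split("|")
    .explode()
    .value_counts()
)
```

With the real column, the same chain would give one count per individual search basis rather than one per combination.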

search_basis_exp
Categorical

HIGH CARDINALITY  MISSING 

Distinct  28990
Distinct (%)  45.7%
Missing  344258
Missing (%)  84.4%
Memory size  6.2 MiB

incident to arrest  2366
search incident to arrest  1541
arrest  1321
arrested  800
Incident to arrest  772
Other values (28985)  56626

Length

Max length  250
Median length  236
Mean length  27.766011
Min length  3

Characters and Unicode

Total characters  1761087
Distinct characters  88
Distinct categories  11
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  24904
Unique (%)  39.3%

Sample

1st row  IMPOUNDED
2nd row  search incident to arrest
3rd row  search incident to arrest
4th row  Male drunk in public
5th row  273.6 viloation of TRO

Common Values

Value  Count  Frequency (%)
incident to arrest  2366  0.6%
search incident to arrest  1541  0.4%
arrest  1321  0.3%
arrested  800  0.2%
Incident to arrest  772  0.2%
INCIDENT TO ARREST  652  0.2%
searched incident to arrest  572  0.1%
consent  550  0.1%
5150 hold  516  0.1%
consent search  465  0.1%
Other values (28980)  53871  13.2%
(Missing)  344258  84.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
to 20323
 
7.0%
arrest 20146
 
7.0%
for 13693
 
4.7%
incident 12131
 
4.2%
search 9435
 
3.3%
arrested 9119
 
3.2%
subject 8539
 
3.0%
was 8139
 
2.8%
searched 6252
 
2.2%
and 5808
 
2.0%
Other values (7320) 175627
60.7%

Most occurring characters

ValueCountFrequency (%)
226198
 
12.8%
e 138232
 
7.8%
r 116668
 
6.6%
t 104510
 
5.9%
a 100275
 
5.7%
n 85671
 
4.9%
s 78939
 
4.5%
o 76673
 
4.4%
i 63628
 
3.6%
c 59719
 
3.4%
Other values (78) 710574
40.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1099455
62.4%
Uppercase Letter 359554
 
20.4%
Space Separator 226198
 
12.8%
Decimal Number 54950
 
3.1%
Other Punctuation 14716
 
0.8%
Close Punctuation 2745
 
0.2%
Open Punctuation 2743
 
0.2%
Dash Punctuation 699
 
< 0.1%
Math Symbol 14
 
< 0.1%
Currency Symbol 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 138232
12.6%
r 116668
10.6%
t 104510
9.5%
a 100275
9.1%
n 85671
 
7.8%
s 78939
 
7.2%
o 76673
 
7.0%
i 63628
 
5.8%
c 59719
 
5.4%
d 58870
 
5.4%
Other values (16) 216270
19.7%
Uppercase Letter
ValueCountFrequency (%)
E 39943
11.1%
R 33124
9.2%
A 32432
9.0%
T 31278
 
8.7%
S 30653
 
8.5%
N 26150
 
7.3%
O 23930
 
6.7%
I 22840
 
6.4%
C 20263
 
5.6%
D 19228
 
5.3%
Other values (16) 79713
22.2%
Other Punctuation
ValueCountFrequency (%)
. 9951
67.6%
, 2588
 
17.6%
/ 1104
 
7.5%
& 597
 
4.1%
' 293
 
2.0%
: 62
 
0.4%
; 53
 
0.4%
" 40
 
0.3%
# 15
 
0.1%
? 4
 
< 0.1%
Other values (4) 9
 
0.1%
Decimal Number
ValueCountFrequency (%)
5 14267
26.0%
1 12107
22.0%
0 8044
14.6%
4 5518
 
10.0%
2 4591
 
8.4%
6 3142
 
5.7%
7 2588
 
4.7%
3 2167
 
3.9%
9 1394
 
2.5%
8 1132
 
2.1%
Math Symbol
ValueCountFrequency (%)
+ 6
42.9%
= 5
35.7%
> 2
 
14.3%
< 1
 
7.1%
Close Punctuation
ValueCountFrequency (%)
) 2744
> 99.9%
] 1
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
^ 2
66.7%
` 1
33.3%
Space Separator
ValueCountFrequency (%)
226198
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2743
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 699
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1459009
82.8%
Common 302078
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 138232
 
9.5%
r 116668
 
8.0%
t 104510
 
7.2%
a 100275
 
6.9%
n 85671
 
5.9%
s 78939
 
5.4%
o 76673
 
5.3%
i 63628
 
4.4%
c 59719
 
4.1%
d 58870
 
4.0%
Other values (42) 575824
39.5%
Common
ValueCountFrequency (%)
226198
74.9%
5 14267
 
4.7%
1 12107
 
4.0%
. 9951
 
3.3%
0 8044
 
2.7%
4 5518
 
1.8%
2 4591
 
1.5%
6 3142
 
1.0%
) 2744
 
0.9%
( 2743
 
0.9%
Other values (26) 12773
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1761087
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
226198
 
12.8%
e 138232
 
7.8%
r 116668
 
6.6%
t 104510
 
5.9%
a 100275
 
5.7%
n 85671
 
4.9%
s 78939
 
4.5%
o 76673
 
4.4%
i 63628
 
3.6%
c 59719
 
3.4%
Other values (78) 710574
40.3%
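
The search_basis_exp values above are free text, and the same phrase appears in several casings ("incident to arrest", "Incident to arrest", "INCIDENT TO ARREST"), which inflates the distinct count. A minimal normalisation sketch follows; the first three values are copied from the Common Values table, while the fourth value's extra whitespace is invented for illustration.

```python
import pandas as pd

# Toy values; only the first three spellings appear in the report.
raw = pd.Series([
    "incident to arrest",
    "Incident to arrest",
    "INCIDENT TO ARREST",
    "  search  incident to arrest ",  # hypothetical messy entry
    None,
])

# Lower-case, trim, and collapse runs of whitespace before counting.
normalised = (
    raw.str.lower()
    .str.strip()
    .str.replace(r"\s+", " ", regex=True)
)
counts = normalised.value_counts()
```

Whether merging these variants is appropriate depends on the analysis; the sketch only removes differences in case and spacing, not in wording.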

seiz_basis
Categorical

Distinct  49
Distinct (%)  0.5%
Missing  398568
Missing (%)  97.8%
Memory size  6.2 MiB

Evidence  2999
Contraband  2090
Impound of vehicle  1217
Contraband|Evidence  1049
Evidence|Contraband  551
Other values (44)  1210

Length

Max length  76
Median length  65
Mean length  15.307591
Min length  8

Characters and Unicode

Total characters  139544
Distinct characters  30
Distinct categories  5
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  12
Unique (%)  0.1%

Sample

1st row  Contraband
2nd row  Evidence|Impound of vehicle
3rd row  Contraband|Evidence
4th row  Impound of vehicle
5th row  Evidence

Common Values

Value  Count  Frequency (%)
Evidence  2999  0.7%
Contraband  2090  0.5%
Impound of vehicle  1217  0.3%
Contraband|Evidence  1049  0.3%
Evidence|Contraband  551  0.1%
Safekeeping as allowed by law/statute  387  0.1%
Evidence|Impound of vehicle  237  0.1%
Contraband|Evidence|Impound of vehicle  106  < 0.1%
Abandoned property  76  < 0.1%
Contraband|Impound of vehicle  57  < 0.1%
Other values (39)  347  0.1%
(Missing)  398568  97.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
evidence 2999
19.9%
contraband 2090
13.9%
of 1787
11.9%
vehicle 1662
11.0%
impound 1307
8.7%
contraband|evidence 1049
 
7.0%
as 562
 
3.7%
allowed 562
 
3.7%
by 562
 
3.7%
evidence|contraband 551
 
3.7%
Other values (38) 1913
12.7%

Most occurring characters

ValueCountFrequency (%)
e 17040
12.2%
n 15843
 
11.4%
d 11812
 
8.5%
a 10977
 
7.9%
o 8379
 
6.0%
i 7575
 
5.4%
c 7013
 
5.0%
v 7011
 
5.0%
5928
 
4.2%
t 5823
 
4.2%
Other values (20) 42143
30.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 118754
85.1%
Uppercase Letter 11708
 
8.4%
Space Separator 5928
 
4.2%
Math Symbol 2592
 
1.9%
Other Punctuation 562
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 17040
14.3%
n 15843
13.3%
d 11812
9.9%
a 10977
9.2%
o 8379
 
7.1%
i 7575
 
6.4%
c 7013
 
5.9%
v 7011
 
5.9%
t 5823
 
4.9%
b 4697
 
4.0%
Other values (12) 22584
19.0%
Uppercase Letter
ValueCountFrequency (%)
E 5224
44.6%
C 4031
34.4%
I 1786
 
15.3%
S 563
 
4.8%
A 104
 
0.9%
Space Separator
ValueCountFrequency (%)
5928
100.0%
Math Symbol
ValueCountFrequency (%)
| 2592
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 562
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 130462
93.5%
Common 9082
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 17040
13.1%
n 15843
12.1%
d 11812
 
9.1%
a 10977
 
8.4%
o 8379
 
6.4%
i 7575
 
5.8%
c 7013
 
5.4%
v 7011
 
5.4%
t 5823
 
4.5%
E 5224
 
4.0%
Other values (17) 33765
25.9%
Common
ValueCountFrequency (%)
5928
65.3%
| 2592
28.5%
/ 562
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 139544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 17040
12.2%
n 15843
 
11.4%
d 11812
 
8.5%
a 10977
 
7.9%
o 8379
 
6.0%
i 7575
 
5.4%
c 7013
 
5.0%
v 7011
 
5.0%
5928
 
4.2%
t 5823
 
4.2%
Other values (20) 42143
30.2%
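
In the seiz_basis values above, "Contraband|Evidence" (1049) and "Evidence|Contraband" (551) are counted as different categories although they contain the same labels. If label order carries no meaning (an assumption the report does not confirm), the combinations could be merged by sorting the labels, as in this sketch; the helper name `canonical` is hypothetical.

```python
import pandas as pd


def canonical(value):
    """Sort the pipe-separated labels so that order no longer matters."""
    if pd.isna(value):
        return value
    return "|".join(sorted(value.split("|")))


# Counts copied from the seiz_basis Common Values table.
counts = pd.Series({
    "Contraband|Evidence": 1049,
    "Evidence|Contraband": 551,
    "Evidence": 2999,
})

# Group the counts by the canonical form of each index label and add them up.
merged = counts.groupby(canonical).sum()
```

Here the two orderings collapse into a single "Contraband|Evidence" entry whose count is the sum of both.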

prop_type
Categorical

HIGH CARDINALITY  MISSING 

Distinct  490
Distinct (%)  5.4%
Missing  398568
Missing (%)  97.8%
Memory size  6.2 MiB

Drugs/narcotics  1551
Vehicle  1194
Drug Paraphernalia  1115
Drugs/narcotics|Drug Paraphernalia  794
Other Contraband or evidence  586
Other values (485)  3876

Length

Max length  202
Median length  151
Mean length  27.323058
Min length  5

Characters and Unicode

Total characters  249077
Distinct characters  35
Distinct categories  7
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  259
Unique (%)  2.8%

Sample

1st row  Drugs/narcotics
2nd row  Alcohol
3rd row  Drugs/narcotics|Money|Drug Paraphernalia
4th row  Vehicle
5th row  Firearm(s)|Ammunition|Cell phone(s) or electronic device(s)

Common Values

Value  Count  Frequency (%)
Drugs/narcotics  1551  0.4%
Vehicle  1194  0.3%
Drug Paraphernalia  1115  0.3%
Drugs/narcotics|Drug Paraphernalia  794  0.2%
Other Contraband or evidence  586  0.1%
Weapon(s) other than a firearm  509  0.1%
Alcohol  483  0.1%
Drug Paraphernalia|Drugs/narcotics  236  0.1%
Cell phone(s) or electronic device(s)  201  < 0.1%
Suspected Stolen property  176  < 0.1%
Other values (480)  2271  0.6%
(Missing)  398568  97.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
paraphernalia 2089
 
8.8%
or 2046
 
8.6%
drug 1570
 
6.6%
drugs/narcotics 1551
 
6.5%
other 1518
 
6.4%
vehicle 1194
 
5.0%
contraband 1171
 
4.9%
drugs/narcotics|drug 1057
 
4.4%
evidence 1054
 
4.4%
a 876
 
3.7%
Other values (233) 9672
40.6%

Most occurring characters

ValueCountFrequency (%)
r 26702
 
10.7%
a 22214
 
8.9%
e 22150
 
8.9%
n 15726
 
6.3%
14682
 
5.9%
c 14101
 
5.7%
o 13723
 
5.5%
i 13625
 
5.5%
s 11269
 
4.5%
t 10895
 
4.4%
Other values (25) 83990
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 200742
80.6%
Uppercase Letter 18711
 
7.5%
Space Separator 14682
 
5.9%
Math Symbol 4834
 
1.9%
Other Punctuation 3760
 
1.5%
Close Punctuation 3174
 
1.3%
Open Punctuation 3174
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 26702
13.3%
a 22214
11.1%
e 22150
11.0%
n 15726
 
7.8%
c 14101
 
7.0%
o 13723
 
6.8%
i 13625
 
6.8%
s 11269
 
5.6%
t 10895
 
5.4%
h 9023
 
4.5%
Other values (10) 41314
20.6%
Uppercase Letter
ValueCountFrequency (%)
D 6775
36.2%
P 3015
16.1%
C 2046
 
10.9%
V 1623
 
8.7%
O 1171
 
6.3%
S 1150
 
6.1%
A 1028
 
5.5%
W 876
 
4.7%
F 548
 
2.9%
M 479
 
2.6%
Space Separator
ValueCountFrequency (%)
14682
100.0%
Math Symbol
ValueCountFrequency (%)
| 4834
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 3760
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3174
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3174
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 219453
88.1%
Common 29624
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 26702
12.2%
a 22214
 
10.1%
e 22150
 
10.1%
n 15726
 
7.2%
c 14101
 
6.4%
o 13723
 
6.3%
i 13625
 
6.2%
s 11269
 
5.1%
t 10895
 
5.0%
h 9023
 
4.1%
Other values (20) 60025
27.4%
Common
ValueCountFrequency (%)
14682
49.6%
| 4834
 
16.3%
/ 3760
 
12.7%
) 3174
 
10.7%
( 3174
 
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 249077
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 26702
 
10.7%
a 22214
 
8.9%
e 22150
 
8.9%
n 15726
 
6.3%
14682
 
5.9%
c 14101
 
5.7%
o 13723
 
5.5%
i 13625
 
5.5%
s 11269
 
4.5%
t 10895
 
4.4%
Other values (25) 83990
33.7%
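
The percentages in the prop_type summary above use two different denominators: Missing (%) is relative to all 407684 observations, while Distinct (%) and Unique (%) appear to be relative to the non-missing values only. This reading is inferred from the figures rather than stated in the report; the arithmetic below is a quick check of it.

```python
n_rows = 407684      # Number of observations (report overview)
missing = 398568     # prop_type: Missing
distinct = 490       # prop_type: Distinct
unique = 259         # prop_type: Unique

# Values that are actually present in the column.
present = n_rows - missing

# Missing share uses all rows; distinct and unique shares use present values.
missing_pct = round(100 * missing / n_rows, 1)
distinct_pct = round(100 * distinct / present, 1)
unique_pct = round(100 * unique / present, 1)
```

Under these assumptions the three results round to the 97.8%, 5.4%, and 2.8% shown in the report.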

cont
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct  669
Distinct (%)  0.2%
Missing  5
Missing (%)  < 0.1%
Memory size  6.2 MiB

None  369726
Alcohol  11476
Drugs/narcotics  6526
Drug Paraphernalia  4943
Drugs/narcotics|Drug Paraphernalia  2538
Other values (664)  12470

Length

Max length  194
Median length  4
Mean length  5.6240277
Min length  4

Characters and Unicode

Total characters  2292798
Distinct characters  35
Distinct categories  7
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  322
Unique (%)  0.1%

Sample

1st row  None
2nd row  None
3rd row  None
4th row  None
5th row  None

Common Values

Value  Count  Frequency (%)
None  369726  90.7%
Alcohol  11476  2.8%
Drugs/narcotics  6526  1.6%
Drug Paraphernalia  4943  1.2%
Drugs/narcotics|Drug Paraphernalia  2538  0.6%
Weapon(s) other than a firearm  2222  0.5%
Other Contraband or evidence  2068  0.5%
Drug Paraphernalia|Drugs/narcotics  947  0.2%
Suspected Stolen property  731  0.2%
Firearm(s)  665  0.2%
Other values (659)  5837  1.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 369726
81.8%
alcohol 11476
 
2.5%
paraphernalia 8064
 
1.8%
drugs/narcotics 6526
 
1.4%
drug 6396
 
1.4%
other 5435
 
1.2%
or 5160
 
1.1%
than 3255
 
0.7%
a 3255
 
0.7%
contraband 3241
 
0.7%
Other values (301) 29237
 
6.5%

Most occurring characters

ValueCountFrequency (%)
o 431619
18.8%
e 424236
18.5%
n 418639
18.3%
N 369726
16.1%
r 86923
 
3.8%
a 75823
 
3.3%
c 48344
 
2.1%
44092
 
1.9%
l 42132
 
1.8%
i 37630
 
1.6%
Other values (25) 313634
13.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1771285
77.3%
Uppercase Letter 434995
 
19.0%
Space Separator 44092
 
1.9%
Other Punctuation 12790
 
0.6%
Math Symbol 12012
 
0.5%
Open Punctuation 8812
 
0.4%
Close Punctuation 8812
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 431619
24.4%
e 424236
24.0%
n 418639
23.6%
r 86923
 
4.9%
a 75823
 
4.3%
c 48344
 
2.7%
l 42132
 
2.4%
i 37630
 
2.1%
s 36002
 
2.0%
h 34279
 
1.9%
Other values (10) 135658
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
N 369726
85.0%
D 23243
 
5.3%
A 13323
 
3.1%
P 10453
 
2.4%
C 5160
 
1.2%
W 3255
 
0.7%
O 3241
 
0.7%
S 3220
 
0.7%
F 1719
 
0.4%
M 1655
 
0.4%
Space Separator
ValueCountFrequency (%)
44092
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 12790
100.0%
Math Symbol
ValueCountFrequency (%)
| 12012
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8812
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8812
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2206280
96.2%
Common 86518
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 431619
19.6%
e 424236
19.2%
n 418639
19.0%
N 369726
16.8%
r 86923
 
3.9%
a 75823
 
3.4%
c 48344
 
2.2%
l 42132
 
1.9%
i 37630
 
1.7%
s 36002
 
1.6%
Other values (20) 235206
10.7%
Common
ValueCountFrequency (%)
44092
51.0%
/ 12790
 
14.8%
| 12012
 
13.9%
( 8812
 
10.2%
) 8812
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2292798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 431619
18.8%
e 424236
18.5%
n 418639
18.3%
N 369726
16.1%
r 86923
 
3.8%
a 75823
 
3.3%
c 48344
 
2.1%
44092
 
1.9%
l 42132
 
1.8%
i 37630
 
1.6%
Other values (25) 313634
13.7%
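
The cont variable above carries the IMBALANCE flag because a single value ("None") accounts for 90.7% of the rows. One common way to quantify this is one minus the normalised Shannon entropy of the class counts. Whether the report uses exactly this formula is an assumption, and lumping "Other values" into one class, as done below, understates the true number of classes.

```python
import math


def imbalance_score(counts):
    """1 minus normalised Shannon entropy: 0 = balanced, 1 = one class only."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 1.0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


# Counts copied from the cont Common Values table; the last entry is the
# "Other values (659)" bucket treated as a single class for illustration.
cont_counts = [369726, 11476, 6526, 4943, 2538, 2222, 2068, 947, 731, 665, 5837]
score = imbalance_score(cont_counts)
```

A score near 0 indicates evenly spread classes and a score near 1 indicates that one class dominates.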

actions
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct  11672
Distinct (%)  2.9%
Missing  5
Missing (%)  < 0.1%
Memory size  6.2 MiB

None  246838
Curbside detention  24321
Handcuffed or flex cuffed  16308
Search of person was conducted|Handcuffed or flex cuffed  9208
Handcuffed or flex cuffed|Search of person was conducted  9022
Other values (11667)  101982

Length

Max length  360
Median length  4
Mean length  28.120136
Min length  4

Characters and Unicode

Total characters  11463989
Distinct characters  38
Distinct categories  4
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  8059
Unique (%)  2.0%

Sample

1st row  Search of property was conducted|Vehicle impounded
2nd row  Curbside detention
3rd row  Patrol car detention|Handcuffed or flex cuffed|Search of person was conducted
4th row  Curbside detention|Handcuffed or flex cuffed|Search of person was conducted
5th row  None

Common Values

Value  Count  Frequency (%)
None  246838  60.5%
Curbside detention  24321  6.0%
Handcuffed or flex cuffed  16308  4.0%
Search of person was conducted|Handcuffed or flex cuffed  9208  2.3%
Handcuffed or flex cuffed|Search of person was conducted  9022  2.2%
Patrol car detention|Handcuffed or flex cuffed  3113  0.8%
Curbside detention|Handcuffed or flex cuffed  2976  0.7%
Patrol car detention|Search of person was conducted|Handcuffed or flex cuffed  2743  0.7%
Person photographed  2531  0.6%
Search of person was conducted|Handcuffed or flex cuffed|Search of property was conducted  2520  0.6%
Other values (11662)  88099  21.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 246838
15.5%
was 125241
 
7.8%
or 117174
 
7.3%
of 116125
 
7.3%
flex 111013
 
6.9%
person 90179
 
5.6%
cuffed 51222
 
3.2%
conducted 47480
 
3.0%
search 45074
 
2.8%
handcuffed 44529
 
2.8%
Other values (233) 602459
37.7%

Most occurring characters

ValueCountFrequency (%)
e 1462183
12.8%
1189655
 
10.4%
o 1051750
 
9.2%
n 837630
 
7.3%
d 819897
 
7.2%
r 725015
 
6.3%
f 707926
 
6.2%
c 705365
 
6.2%
t 485449
 
4.2%
a 474788
 
4.1%
Other values (28) 3004331
26.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9391409
81.9%
Space Separator 1189655
 
10.4%
Uppercase Letter 648121
 
5.7%
Math Symbol 234804
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1462183
15.6%
o 1051750
11.2%
n 837630
8.9%
d 819897
8.7%
r 725015
7.7%
f 707926
7.5%
c 705365
7.5%
t 485449
 
5.2%
a 474788
 
5.1%
u 406584
 
4.3%
Other values (15) 1714822
18.3%
Uppercase Letter
ValueCountFrequency (%)
N 246838
38.1%
S 116125
17.9%
H 111013
17.1%
P 81745
 
12.6%
C 58772
 
9.1%
A 16520
 
2.5%
V 10181
 
1.6%
F 6635
 
1.0%
E 190
 
< 0.1%
I 55
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1189655
100.0%
Math Symbol
ValueCountFrequency (%)
| 234804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10039530
87.6%
Common 1424459
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1462183
14.6%
o 1051750
10.5%
n 837630
 
8.3%
d 819897
 
8.2%
r 725015
 
7.2%
f 707926
 
7.1%
c 705365
 
7.0%
t 485449
 
4.8%
a 474788
 
4.7%
u 406584
 
4.0%
Other values (26) 2362943
23.5%
Common
ValueCountFrequency (%)
1189655
83.5%
| 234804
 
16.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11463989
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1462183
12.8%
1189655
 
10.4%
o 1051750
 
9.2%
n 837630
 
7.3%
d 819897
 
7.2%
r 725015
 
6.3%
f 707926
 
6.2%
c 705365
 
6.2%
t 485449
 
4.2%
a 474788
 
4.1%
Other values (28) 3004331
26.2%
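
The actions variable above has 11672 distinct values largely because each row stores a pipe-separated combination of actions. One way to make such a column easier to analyse is to expand it into one 0/1 indicator column per action with pandas' `Series.str.get_dummies`. The toy Series below uses values from the Sample block and is only a sketch of the idea.

```python
import pandas as pd

# Toy rows copied from the actions Sample block.
actions = pd.Series([
    "Search of property was conducted|Vehicle impounded",
    "Curbside detention",
    "Curbside detention|Handcuffed or flex cuffed|Search of person was conducted",
    "None",
])

# One indicator column per distinct action; the literal string "None"
# becomes its own column here and may need separate handling.
indicators = actions.str.get_dummies(sep="|")
```

Column sums of `indicators` then give the number of stops in which each action occurred, independent of the order in which actions were recorded.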

act_consent
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct  335
Distinct (%)  0.3%
Missing  297641
Missing (%)  73.0%
Memory size  6.2 MiB

NA|NA  38272
NA|NA|NA  29881
NA|NA|NA|NA  17701
NA|NA|NA|NA|NA  7547
NA|NA|NA|NA|NA|NA  2403
Other values (330)  14239

Length

Max length  35
Median length  34
Mean length  8.251429
Min length  1

Characters and Unicode

Total characters  908012
Distinct characters  4
Distinct categories  2
Distinct scripts  2
Distinct blocks  1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique  95
Unique (%)  0.1%

Sample

1st row  NA|NA
2nd row  NA|NA|NA
3rd row  NA|NA|NA
4th row  NA|NA
5th row  NA|NA|NA

Common Values

Value  Count  Frequency (%)
NA|NA  38272  9.4%
NA|NA|NA  29881  7.3%
NA|NA|NA|NA  17701  4.3%
NA|NA|NA|NA|NA  7547  1.9%
NA|NA|NA|NA|NA|NA  2403  0.6%
Y|NA  1814  0.4%
NA|Y  1112  0.3%
Y|NA|NA  1095  0.3%
NA|Y|NA  892  0.2%
NA|NA|NA|NA|NA|NA|NA  827  0.2%
Other values (325)  8499  2.1%
(Missing)  297641  73.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na|na 38272
34.8%
na|na|na 29881
27.2%
na|na|na|na 17701
16.1%
na|na|na|na|na 7547
 
6.9%
na|na|na|na|na|na 2403
 
2.2%
y|na 1814
 
1.6%
na|y 1112
 
1.0%
y|na|na 1095
 
1.0%
na|y|na 892
 
0.8%
na|na|na|na|na|na|na 827
 
0.8%
Other values (325) 8499
 
7.7%

Most occurring characters

ValueCountFrequency (%)
N 330813
36.4%
A 328361
36.2%
| 234804
25.9%
Y 14034
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 673208
74.1%
Math Symbol 234804
 
25.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 330813
49.1%
A 328361
48.8%
Y 14034
 
2.1%
Math Symbol
ValueCountFrequency (%)
| 234804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 673208
74.1%
Common 234804
 
25.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 330813
49.1%
A 328361
48.8%
Y 14034
 
2.1%
Common
ValueCountFrequency (%)
| 234804
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 908012
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 330813
36.4%
A 328361
36.2%
| 234804
25.9%
Y 14034
 
1.5%
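
The act_consent tables above report 234804 pipe characters, the same number as in the actions variable, which suggests that each act_consent token lines up positionally with one action. That alignment is an inference from the counts, not something the report states. Under that assumption, the two columns could be paired as follows; `pair_actions` is a hypothetical helper.

```python
def pair_actions(actions, act_consent):
    """Zip pipe-separated actions with their consent flags (assumed aligned)."""
    action_list = actions.split("|")
    consent_list = act_consent.split("|")
    if len(action_list) != len(consent_list):
        raise ValueError("actions and act_consent have different lengths")
    return list(zip(action_list, consent_list))


# Values taken from the first row of the report's Sample table.
pairs = pair_actions(
    "Search of property was conducted|Vehicle impounded",
    "NA|NA",
)
```

The length check makes any row that violates the assumed alignment fail loudly instead of being paired incorrectly.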

Interactions

[Interaction plots for pairs of numeric variables: 100 figure panels, images not preserved in this text]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
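
As a rough illustration of the nullity correlation described above, the following pandas sketch correlates the missingness indicators of a few columns. The toy frame is invented; it only mirrors the fact that seiz_basis and prop_type report the same number of missing cells (398568) in this report. It is not the code that produced the heatmap.

```python
import pandas as pd

# Toy frame: seiz_basis and prop_type are missing in the same rows,
# search_basis is missing in an unrelated pattern.
df = pd.DataFrame({
    "seiz_basis": ["Evidence", None, None, "Contraband"],
    "prop_type": ["Vehicle", None, None, "Alcohol"],
    "search_basis": [None, "Consent given", None, "Incident to arrest"],
})

# 1 where a value is missing, 0 where it is present.
nullity = df.isna().astype(int)

# Pearson correlation between the missingness indicators of each column pair.
nullity_corr = nullity.corr()
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means the patterns are unrelated.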

Sample

,Unnamed: 0,stop_id,pid,id,ori,agency,exp_years,date,time,dur,is_serv,assign_key,assign_words,inters,block,ldmk,street,hw_exit,is_school,school_name,city,beat,beat_name,is_student,lim_eng,age,gender_words,is_gendnc,gender_code,gendnc_code,lgbt,race,disability,reason_words,reasonid,reason_text,reason_detail,reason_exp,search_basis,search_basis_exp,seiz_basis,prop_type,cont,actions,act_consent
0,1,84362,1,84362_1,CA0371100,SD,10,2019-01-01,00:15:07,30,0,1,"Patrol, traffic enforcement, field operations",NaN,3500.0,NaN,UNIVERSITY,NaN,0,NaN,SAN DIEGO,839,Cherokee Point 839,0,1,30,Male,0,1,NaN,No,hisp,None,Traffic Violation,54116.0,27150(A) VC - INADEQUATE MUFFLERS (I) 54116,Equipment Violation,LOUD EXHAUST,Vehicle inventory,IMPOUNDED,NaN,NaN,None,Search of property was conducted|Vehicle impounded,NA|NA
1,2,84364,1,84364_1,CA0371100,SD,2,2019-01-01,00:15:16,10,0,1,"Patrol, traffic enforcement, field operations",NaN,7500.0,NaN,hillside dr,NaN,0,NaN,LA JOLLA,124,La Jolla 124,0,0,44,Female,0,2,NaN,No,white,None,Reasonable Suspicion,53130.0,415(2) PC - LOUD/UNREASONABLE NOISE (I) 53130,Officer witnessed commission of a crime,loud party,NaN,NaN,NaN,NaN,None,Curbside detention,NaN
2,3,84365,1,84365_1,CA0371100,SD,1,2019-01-01,00:02:00,5,0,1,"Patrol, traffic enforcement, field operations",NaN,1300.0,NaN,ocean blvd,NaN,0,NaN,SAN DIEGO,122,Pacific Beach 122,0,0,30,Female,0,2,NaN,No,white,None,Reasonable Suspicion,64005.0,647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005,Officer witnessed commission of a crime,"stumbling back and forth, unable to maintain balance",Incident to arrest,search incident to arrest,NaN,NaN,None,Patrol car detention|Handcuffed or flex cuffed|Search of person was conducted,NA|NA|NA
3,4,84366,1,84366_1,CA0371100,SD,1,2019-01-01,00:38:00,5,0,1,"Patrol, traffic enforcement, field operations",NaN,800.0,NaN,garnet,NaN,0,NaN,SAN DIEGO,122,Pacific Beach 122,0,0,25,Male,0,1,NaN,No,hisp,None,Reasonable Suspicion,64005.0,647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005,Officer witnessed commission of a crime,fighting with security,Incident to arrest,search incident to arrest,NaN,NaN,None,Curbside detention|Handcuffed or flex cuffed|Search of person was conducted,NA|NA|NA
4,5,84369,1,84369_1,CA0371100,SD,17,2019-01-01,01:06:41,2,1,1,"Patrol, traffic enforcement, field operations",NaN,4400.0,NaN,coronado,NaN,0,NaN,SAN DIEGO,614,Ocean Beach 614,0,1,40,Male,0,1,NaN,No,black,None,Reasonable Suspicion,32022.0,602 PC - TRESPASSING (M) 32022,Matched suspect description,rc of male at vacant house,NaN,NaN,NaN,NaN,None,None,NaN
5,6,84370,1,84370_1,CA0371100,SD,1,2019-01-01,01:11:05,5,0,1,"Patrol, traffic enforcement, field operations",governor dr,NaN,NaN,radcliffe,NaN,0,NaN,SAN DIEGO,115,University City 115,0,0,75,Female,0,2,NaN,No,white,None,Traffic Violation,54110.0,24601 VC - FAIL MAINT LIC PLATE LAMP (I) 54110,Equipment Violation,no license plate lights,NaN,NaN,NaN,NaN,None,None,NaN
6,7,84371,1,84371_1,CA0371100,SD,1,2019-01-01,01:15:56,60,0,1,"Patrol, traffic enforcement, field operations",la jolla village dr,NaN,NaN,villa la jolla dr,NaN,0,NaN,SAN DIEGO,126,Torrey Pines 126,0,0,45,Male,0,1,NaN,No,white,None,Traffic Violation,54056.0,20002 VC - HIT AND RUN (M) 54056,Moving Violation,drive hit victim vehicle causing damage and minor injury and fled on foot,NaN,NaN,NaN,NaN,None,None,NaN
7,8,84372,1,84372_1,CA0371100,SD,2,2019-01-01,01:10:54,10,0,1,"Patrol, traffic enforcement, field operations",NaN,1000.0,NaN,pacific beach dr,NaN,0,NaN,SAN DIEGO,122,Pacific Beach 122,0,0,25,Male,0,1,NaN,No,hisp,None,Reasonable Suspicion,64005.0,647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005,Officer witnessed commission of a crime,fell in street,NaN,NaN,NaN,NaN,None,Curbside detention,NaN
8,9,84372,2,84372_2,CA0371100,SD,2,2019-01-01,01:10:54,10,0,1,"Patrol, traffic enforcement, field operations",NaN,1000.0,NaN,pacific beach dr,NaN,0,NaN,SAN DIEGO,122,Pacific Beach 122,0,0,23,Female,0,2,NaN,No,hisp,None,Reasonable Suspicion,64005.0,647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005,Officer witnessed commission of a crime,fell in street,NaN,NaN,NaN,NaN,None,Curbside detention,NaN
9,10,84373,1,84373_1,CA0371100,SD,9,2019-01-01,01:10:52,5,0,1,"Patrol, traffic enforcement, field operations",NaN,300.0,NaN,5th Av,NaN,0,NaN,SAN DIEGO,523,Gaslamp 523,0,0,21,Male,0,1,NaN,No,hisp,None,Reasonable Suspicion,64005.0,647(F) PC - DISORD CONDUCT:ALCOHOL (M) 64005,Officer witnessed commission of a crime,Male drunk in public unable to care for himself,Incident to arrest,Male drunk in public,NaN,NaN,None,Handcuffed or flex cuffed|Search of person was conducted,NA|NA
,Unnamed: 0,stop_id,pid,id,ori,agency,exp_years,date,time,dur,is_serv,assign_key,assign_words,inters,block,ldmk,street,hw_exit,is_school,school_name,city,beat,beat_name,is_student,lim_eng,age,gender_words,is_gendnc,gender_code,gendnc_code,lgbt,race,disability,reason_words,reasonid,reason_text,reason_detail,reason_exp,search_basis,search_basis_exp,seiz_basis,prop_type,cont,actions,act_consent
69804,69805,449687,3,449687_3,CA0371100,SD,1,2021-06-30,15:35:00,60,0,1,"Patrol, traffic enforcement, field operations",NaN,100.0,NaN,W San Ysidro blvd,NaN,0,NaN,SAN YSIDRO,712,San Ysidro 712,0,0,60,Male,0,1,NaN,No,white,None,Traffic Violation,54431.0,24951(B) VC - TURN SIGNAL VIOLATION (I) 54431,Moving Violation,pulled over vehicle for not using turn signal for a lane change and for the 3rd brakelight being out.,NaN,NaN,NaN,NaN,None,Person removed from vehicle by order,NaN
69805,69806,449692,1,449692_1,CA0371100,SD,1,2021-06-30,22:46:00,20,0,1,"Patrol, traffic enforcement, field operations",NaN,4300.0,NaN,University,NaN,0,NaN,SAN DIEGO,832,Teralta West 832,0,0,18,Male,0,1,NaN,No,hisp,None,Traffic Violation,54168.0,5204(A) VC - EXPIRED TABS/FAIL DISPLAY (I) 54168,"Non-moving Violation, including Registration Violation",displayed reg expired over 6 months,NaN,NaN,NaN,NaN,None,None,NaN
69806,69807,449693,1,449693_1,CA0371100,SD,1,2021-06-30,15:00:00,30,0,1,"Patrol, traffic enforcement, field operations",NaN,200.0,NaN,Via De San Ysidro,NaN,0,NaN,SAN YSIDRO,712,San Ysidro 712,0,0,30,Male,0,1,NaN,No,hisp,None,Traffic Violation,54649.0,24603(D) VC - STOPLAMPS:VEH 2 REQUIRED (I) 54649,Equipment Violation,Subject had a brakeligh that was out.,Condition of parole / probation/ PRCS / mandatory supervision,NaN,NaN,NaN,None,Person removed from vehicle by order|Curbside detention|Search of property was conducted|Search of person was conducted,NA|NA|NA|NA
69807,69808,449693,2,449693_2,CA0371100,SD,1,2021-06-30,15:00:00,30,0,1,"Patrol, traffic enforcement, field operations",NaN,200.0,NaN,Via De San Ysidro,NaN,0,NaN,SAN YSIDRO,712,San Ysidro 712,0,0,30,Female,0,2,NaN,No,hisp,None,Traffic Violation,54649.0,24603(D) VC - STOPLAMPS:VEH 2 REQUIRED (I) 54649,Equipment Violation,Subject had a brakelight that was out.,Condition of parole / probation/ PRCS / mandatory supervision,NaN,NaN,NaN,None,Search of property was conducted|Person removed from vehicle by order,NA|NA
69808,69809,449694,1,449694_1,CA0371100,SD,1,2021-06-30,23:30:00,5,0,1,"Patrol, traffic enforcement, field operations",NaN,5600.0,NaN,ECB,NaN,0,NaN,SAN DIEGO,821,Rolando 821,0,0,25,Male,0,1,NaN,No,white,None,Traffic Violation,54168.0,5204(A) VC - EXPIRED TABS/FAIL DISPLAY (I) 54168,"Non-moving Violation, including Registration Violation",expired reg over 6 months,NaN,NaN,NaN,NaN,None,None,NaN
69809,69810,449701,1,449701_1,CA0371100,SD,1,2021-06-30,21:36:15,5,0,1,"Patrol, traffic enforcement, field operations",NaN,500.0,NaN,Saturn,NaN,0,NaN,SAN DIEGO,721,Egger Highlands 721,0,0,50,Male,0,1,NaN,No,black,None,Reasonable Suspicion,53130.0,415(2) PC - LOUD/UNREASONABLE NOISE (I) 53130,Matched suspect description,Radio call of large group of people at a vehicle making loud noise.,NaN,NaN,NaN,NaN,None,Curbside detention,NaN
69810,69811,449709,1,449709_1,CA0371100,SD,5,2021-06-30,23:29:46,10,0,1,"Patrol, traffic enforcement, field operations",NaN,300.0,NaN,17th st,NaN,0,NaN,SAN DIEGO,521,East Village 521,0,0,40,Male,0,1,NaN,No,white,None,Traffic Violation,54427.0,21800(D) VC - FAIL STOP/YIELD:INOP SIGN (I) 54427,Moving Violation,didnt stop at stop sign,NaN,NaN,NaN,NaN,None,None,NaN
69811,69812,449716,1,449716_1,CA0371100,SD,1,2021-06-30,23:45:00,20,0,1,"Patrol, traffic enforcement, field operations",NaN,4200.0,NaN,del sol ct,NaN,0,NaN,SAN DIEGO,723,Otay Mesa West 723,0,0,30,Male,0,1,NaN,No,hisp,None,Reasonable Suspicion,99999.0,NA - XX AA - CODE NOT FOUND IN TABLE (X) 99999,Matched suspect description,415 - subj and later found to be in uncles driveway,NaN,NaN,NaN,NaN,None,Person removed from vehicle by order,NaN
69812,69813,449726,1,449726_1,CA0371100,SD,11,2021-06-30,15:54:00,12,0,1,"Patrol, traffic enforcement, field operations",15 SOUTH / AERO DRIVE,NaN,NaN,NaN,NaN,0,NaN,SAN DIEGO,313,Kearney Mesa 313,0,0,40,Female,0,2,NaN,No,hisp,None,Traffic Violation,54566.0,23123(A) VC - USE CELLPH W/DRIV W/O HFD (I) 54566,Moving Violation,CELL PHONE,NaN,NaN,NaN,NaN,None,None,NaN
69813,69814,449933,1,449933_1,CA0371100,SD,1,2021-06-30,17:45:00,120,0,1,"Patrol, traffic enforcement, field operations",NaN,4200.0,NaN,mISSION BLVD,NaN,0,NaN,SAN DIEGO,122,Pacific Beach 122,0,0,30,Male,0,1,NaN,No,hisp,None,Reasonable Suspicion,13219.0,245(A)(1) PC - ADW NOT FIREARM (F) 13219,Matched suspect description,RADIO CALL REGARDING A MALE HITTING ANOTHER MALE WITH A CROWBAR,Incident to arrest,245PC,NaN,NaN,None,Handcuffed or flex cuffed|Search of person was conducted|Curbside detention|Patrol car detention,NA|NA|NA|NA